AMERICAN
JOURNAL OF MATHEMATICS


FOUNDED BY THE JOHNS HOPKINS UNIVERSITY 


EDITED BY 


T. H. HILDEBRANDT H. WEYL 
UNIVERSITY OF MICHIGAN THE INSTITUTE FOR ADVANCED STUDY 


F. D. MURNAGHAN R. L. WILDER 
THE JOHNS HOPKINS UNIVERSITY UNIVERSITY OF MICHIGAN 
O. ZARISKI 
THE JOHNS HOPKINS UNIVERSITY 


WITH THE COOPERATION OF 


E. T. BELL C. R. ADAMS G. BIRKHOFF

H. B. CURRY R. D. JAMES N. DUNFORD 

E. J. MCSHANE SAUNDERS MACLANE E. P. LANE 

S. B. MYERS GABOR SZEGO D. MONTGOMERY 
HANS RADEMACHER LEO ZIPPIN J. L. SYNGE 


PUBLISHED UNDER THE JOINT AUSPICES OF 
THE JOHNS HOPKINS UNIVERSITY 
AND 
THE AMERICAN MATHEMATICAL SOCIETY 


VOLUME LXIII
1941


THE JOHNS HOPKINS PRESS 
BALTIMORE, MARYLAND 
U. S. A. 


Volume LXIII, Number 1 
JANUARY, 1941 


THE JOHNS HOPKINS PRESS 
BALTIMORE, MARYLAND 
U. S. A. 


CONTENTS 


Fixed-point theorems for periodic transformations. By P. A. SMITH.

On the matric equation TA = BT + C. By MARK H. INGRAHAM and
H. C. TRIMBLE.

On the degree of convergence of the [...] series of Birkhoff. By
W. H. McEWEN.

Ordered topological spaces. By [...].

A generalization of a metric space with applications to spaces whose
elements are sets. By G. BALEY PRICE.

On representations of certain finite groups. By EUGENE P. WIGNER.

Subspaces of [...] spaces. By F. BOHNENBLUST.

On generalized rings. By DAVID C. MURDOCH and OYSTEIN ORE.

Series expansions in linear vector space. By ORRIN FRINK, JR.

On Leibniz's definition of planes. By HERBERT BUSEMANN.

The axis quadrics at a point of a surface. By M. L. MacQUEEN.

A criterion for solvability by radicals. By B. W. BREWER.

Accessibility and separation by simple closed curves. By EBON E. BETZ.

A discrete group arising in the study of differential operators. By [...].

The Dirichlet problem for a hyperbolic equation. By FRITZ JOHN.

On volume integral invariants of non-holonomic dynamical systems.
By CLAIR J. BLACKALL.

On the law of the iterated logarithm. By PHILIP HARTMAN and AUREL
WINTNER.

A new derivation of the equations for the deformation of elastic shells.
By ERIC REISSNER.

Orthogonal polynomials defined by difference equations. By OTIS E.
LANCASTER.

On an extremum problem in the plane. By GY. SZEKERES.

Explicit bounds for some functions of prime numbers. By BARKLEY
ROSSER.


The AMERICAN JOURNAL OF MATHEMATICS will appear four times yearly. 

The subscription price of the JOURNAL for the current volume is $7.50 (foreign
postage 50 cents); single numbers $2.00.

A few complete sets of the JOURNAL remain on sale.

Papers intended for publication in the JOURNAL may be sent to any of the Editors.

Editorial communications may be sent to Professor F. D. MURNAGHAN at The Johns
Hopkins University.

Subscriptions to the JOURNAL and all business communications should be sent to
THE JOHNS HOPKINS PRESS, BALTIMORE, MARYLAND, U. S. A.


Entered as second-class matter at the Baltimore, Maryland, Postoffice, acceptance for mailing at special 
rate of postage provided for in Section 1108, Act of October 8, 1917, Authorized on July 8, 1918. 


PRINTED IN THE UNITED STATES OF AMERICA 
BY J. H. FURST COMPANY, BALTIMORE, MARYLAND






FIXED-POINT THEOREMS FOR PERIODIC TRANSFORMATIONS.* 


By P. A. SMITH.


The simplest fixed-point theorems for transformations of finite period
seem to be those which assert that fixed points must exist if the space M
under transformation is simply connected in some specified sense. In the first
theorem of this sort, obtained by the author [2] in 1934, M was taken to be a
subset of euclidean n-space in which all singular spheres of dimensions not
exceeding pn − n − 1 could be contracted to points; p here denotes the period
of the transformation and is assumed to be prime.¹ It has recently been shown
by S. Eilenberg² that these assumptions of homotopy could be replaced by
assumptions of homology; it is still required, however, that M be a subset of
a euclidean space and that p be a prime. We propose now to contribute to
these results first, by allowing spaces which are not immersible in euclidean
spaces, and secondly, by allowing periods which are powers of primes. Our
methods of proof are quite different from the earlier ones and, we believe,
somewhat simpler, at least as concerns transformations of prime period.
With regard to transformations of perfectly arbitrary periods it remains an
interesting open question whether or not they must, in general, admit fixed
points, assuming even that the space under transformation is a euclidean
n-space. We have been able to answer in the affirmative only when n ≤ 3 and,
for suitably regular transformations, when n = 4 (sections 5, 6).


1. Preliminaries. Let K be a finite simplicial complex which is sim-
plicially transformed into itself by a homeomorphism T of period p, p a prime.
Assume T to be such that the invariant simplexes form a sub-complex K^0.
Let g be a coefficient group for chains and homologies in K. The transforma-
tion T induces a transformation of the chains of K which is permutable with
the boundary operator A. This is perfectly clear for chains of dimension > 0,
but a word should perhaps be said concerning 0-chains. Let E be a vertex.
Associated with E are two oriented 0-simplexes, namely + E, − E (ordinarily
the + is not written). Similarly, with TE are associated + (TE), − (TE).
We shall agree that T transforms oriented 0-simplexes according to the rule


* Received April 2, 1940.

¹ Although the hypothesis that p be prime was not specifically stated in [2], we
have been aware for some time that the argument is not valid for composite p. The
assertion in the eighth line of section 1 holds only if p is a prime.

² Duke Journal 6 (1940), pp. 428-437.




T(εE) = ε(TE) where ε = + or −. It is easy to see that with this con-
vention A, T are permutable without exception. With regard to cycles, we
shall agree that a 0-chain is a 0-cycle if and only if the sum of its coefficients
is zero. Thus E − TE is a cycle but ± E is not.


Let

σ = 1 + T + ⋯ + T^{p−1},    δ = 1 − T.

We shall use the symbol ρ to stand for σ or δ. Having agreed in a given dis-
cussion which of these operators is to be ρ, the other will be denoted by ρ̄.
We shall say that a chain X is of type ρ if it is expressible in the form ρY.
The null chain is to be regarded as of both possible types. In order that a
chain X in K − K^0 be of type ρ it is necessary and sufficient that ρ̄X = 0.
The necessity of the condition follows from the obvious relations ρρ̄ = ρ̄ρ = 0.
The proof of sufficiency is less immediate but perfectly straightforward (see
[3], p. 142).
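The criterion just stated can be checked by brute force in the smallest interesting case. The following is only a sketch under stated assumptions: the operators are taken to be σ = 1 + T + ⋯ + T^{p−1} and δ = 1 − T (the customary definitions, consistent with how they are used in section 2), K − K^0 is modelled by a single free orbit of p vertices, and the function names are illustrative rather than taken from the paper.

```python
from itertools import product

p = 3  # assumed prime period; coefficients are integers reduced mod p

def apply_T(chain):
    """T cyclically permutes the p vertices of one free orbit: T(v_i) = v_{i+1}.
    The coefficient of v_i in T(chain) is the old coefficient of v_{i-1}."""
    return tuple(chain[(i - 1) % p] for i in range(p))

def sigma(chain):
    """sigma = 1 + T + ... + T^(p-1), reduced mod p."""
    total = [0] * p
    current = chain
    for _ in range(p):
        total = [(a + b) % p for a, b in zip(total, current)]
        current = apply_T(current)
    return tuple(total)

def delta(chain):
    """delta = 1 - T, reduced mod p."""
    image = apply_T(chain)
    return tuple((a - b) % p for a, b in zip(chain, image))

chains = list(product(range(p), repeat=p))  # all 0-chains on the orbit
zero = (0,) * p

# The relations sigma*delta = delta*sigma = 0 hold on every chain.
products_vanish = all(
    sigma(delta(c)) == zero and delta(sigma(c)) == zero for c in chains
)

# A chain is of type sigma exactly when delta annihilates it, and vice versa.
image_sigma = {sigma(c) for c in chains}
image_delta = {delta(c) for c in chains}
kernel_sigma = {c for c in chains if sigma(c) == zero}
kernel_delta = {c for c in chains if delta(c) == zero}
type_criterion_holds = (image_sigma == kernel_delta) and (image_delta == kernel_sigma)

print(products_vanish, type_criterion_holds)
```

The enumeration is exhaustive only for this one orbit and this one prime; it illustrates the statement and is not a substitute for the proof cited from [3].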

Suppose C is a cycle of type ρ. If, modulo K^0, C is the boundary of a
chain of the same type, we shall write C ~ 0 mod K^0. Suppose that C_h, C_{h−1}
are cycles of types ρ and ρ̄ respectively. If there exists a chain X_h such that


we shall write C_h : C_{h−1}.


2. Let it now be understood that the coefficient group g is p, the group
of integers reduced mod p. Then if Y^0 is a chain in K^0 we have ρY^0 = 0;
for, a simplex appearing in Y^0 with the coefficient x will appear in δY^0 with
the coefficient 0 and in σY^0 with the coefficient px (≡ 0 mod p). It follows
from this that every chain of type ρ lies in K − K^0. For, a chain Y can always
be expressed in the form Y′ + Y^0 where Y^0 ⊂ K^0, Y′ ⊂ K − K^0. Then
ρY = ρY′ ⊂ K − K^0.

LEMMA 1. Let C_h, C_{h−1} be cycles of types ρ, ρ̄ such that C_h : C_{h−1}. If
C_h ~ 0 mod K^0, then also C_{h−1} ~ 0 mod K^0.


Proof. Assuming that Cr, ~0 mod K°®, there exist chains Vai, Yn, X° 
such that 
(1) Cr=pXn, = = Ci + X, 
We may write 


Then = Cn+ pZ + pZ°. Since pZ°=0, it follows from the third 




relation in (1) that pZ = X°. But since pZ C K — K°, we have pZ = 0, so 
that Z is of type p. Hence if we operate on both sides of (2) with A we find 


that Cy, 0 mod K°. 


LEMMA 2. Let E be a vertex of K. If K^0 = 0, the cycle δE = E − TE
cannot be ~ 0.

Proof. Suppose on the contrary that there exists a chain X such that
AδX = δE. Let

(1) Z = AX − E.

Then δZ = 0 and therefore, since K^0 = 0, Z is of type σ, say Z = σW. Con-
sequently the sum of the coefficients of Z is zero (modulo p) and Z is a cycle.
Since AX is a cycle it follows from (1) that − E is a cycle, which is impos-
sible (see section 1).


3. The fixed-point theorem. A space from now on will mean a Haus-
dorff space. Let M be a locally bicompact space. We shall say that M is
acyclic mod p if for every bicompact set A there exists a bicompact set A′ ⊃ A
such that relative to the coefficient group p, cycles in A are ~ 0 in A′. Cycles
and homologies are to be understood in the sense of Čech [1].


THEOREM I(α). Let p be a prime and M a finite dimensional locally
bicompact space which is acyclic mod p. Every homeomorphic transformation
of period p^α (α > 0) of M into itself admits at least one fixed point.


Proof of theorem I(1). Let T be a transformation of period p operating
in M. Finite dimensionality means that there exists an n such that every
covering of M (i.e. finite covering by open sets) has a refinement whose nerve
is of dimension ≤ n. Observe that if B is a bicompact set, σB is an invariant
bicompact set containing B. This makes it clear that there can be chosen
pn + p + 1 invariant bicompact sets A_0, ⋯, A_m such that

A_0 ⊂ A_1 ⊂ ⋯ ⊂ A_m    (m = pn + p)

and such that cycles in A_i are ~ 0 in A_{i+1}. Let N = A_m. We shall regard
N as a bicompact subspace of M. T induces a transformation of period p of
N into itself and it will be sufficient to show that this transformation (to be
denoted also by T) admits a fixed point. It is easy to see that in the topology
of N, as in that of M, cycles in A_i are ~ 0 in A_{i+1} (i < m).

Assume now that T admits no fixed point in N. Let us call a covering
U of N regular if (1) its vertices (i.e. component sets) are permuted among
themselves by T; (2) no vertex meets any of its images under T, T^2, ⋯, T^{p−1};
(3) the dimension of the complex (i.e. nerve of) U is less than m. As a
result of the second condition, the simplicial transformation induced by T
in the complex U admits no invariant simplex. We assert that there exist
“arbitrarily fine” regular coverings. This is true because (a) like M, N
admits arbitrarily fine coverings of dimension ≤ n; (b) if V is a covering of N
of dimension ≤ n, the covering U obtained by superimposing TV, ⋯, T^{p−1}V
on V is of dimension ≤ pn + p − 1 < m and satisfies (1); since N is bi-
compact, U can be made arbitrarily fine by taking V sufficiently fine; (c) since
points in N can be separated by open sets, U can be made to satisfy (2) by
taking V sufficiently fine. Thus the totality {U} of regular coverings is a com-
plete system, and can serve for defining homology relations in N.

Consider a definite regular covering U. If h, k are given integers, there
exists a regular refinement U′ of U such that the U-cycle obtained by projecting
into U an h-dimensional U′-cycle in A_k will be the U-coordinate of a complete
cycle in A_k and will therefore be ~ 0 in A_{k+1} (see [1]). Consequently, if U_m
is an arbitrarily chosen regular covering, there exist regular coverings U_{m−1},
⋯, U_0 such that each U_i is a refinement of U_{i+1} and such that if π_{i+1} is a pro-
jection of U_i into U_{i+1} (i < m) and C_i an i-dimensional U_i-cycle in A_i, then
π_{i+1}C_i ~ 0 in A_{i+1}. The particular choice of π_i is immaterial; let π_i be chosen
in such a way that it will be permutable with T, hence with ρ and ρ̄. It is easy
to see that such “invariant” projections exist (see [3], p. 138).

Let ρ_0, ρ_1, ⋯ stand alternately for δ and σ, starting with ρ_0 = δ. Let
X_0 be a U_0-vertex in A_0. Since A_0 is invariant, TX_0 is also in A_0. Hence
ρ_0X_0 is a 0-dimensional cycle in A_0 and therefore π_1ρ_0X_0 ~ 0 in A_1, say
π_1ρ_0X_0 = AX_1, X_1 ⊂ A_1. Then ρ_1X_1 is a cycle because

Aρ_1X_1 = ρ_1ρ_0π_1X_0 = 0.

Since ρ_1X_1 ⊂ A_1, we have π_2ρ_1X_1 ~ 0 in A_2, say π_2ρ_1X_1 = AX_2. Continuing
in this manner we obtain chains X_0, ⋯, X_m such that

(1) AX_{i+1} = π_{i+1}ρ_iX_i    (i = 0, ⋯, m − 1).

Let π_{m+1} be the identical projection of U_m into itself and let

C_i = π_{m+1}π_m ⋯ π_{i+1}ρ_iX_i    (i = 0, ⋯, m).

Then C_m, C_{m−1}, ⋯ are U_m-cycles of types ρ_m, ρ_{m−1}, ⋯ respectively and as a
consequence of (1),

C_m : C_{m−1} : ⋯ : C_0.

The cycle C_0 is of the form δE where E is a U_m-vertex. Since dim U_m < m,
we have C_m = 0 ~ 0 and therefore by lemma 1 (with K^0 = 0) we have C_0 =
δE ~ 0 which, by lemma 2, is impossible.


4. Before proving theorem I(α) with α > 1 it will be necessary to
examine more closely the nature of the fixed-point set in the case α = 1.


THEOREM II. The totality L of fixed points which theorem I(1) asserts
to be non-empty, is acyclic modulo p.


Proof. Let B be a bicompact set containing points of L. It will be
sufficient to prove that there exists a bicompact set B′ ⊃ B such that cycles in
BL are ~ 0 in B′L. Consider the sets A_0, ⋯, A_m in the proof of I(1).
Obviously A_0 could have been any non-empty invariant bicompact set and
therefore we may now suppose that A_0 = σB. Let B′ be a bounded open set
containing A_m. We shall show that for B′ we may take the set A_m.

Let V be a covering of the space N (= A_m) with dim V ≤ n, and let
U, as above, be the union of V, TV, ⋯, T^{p−1}V. Then although dim U ≤
pn + p − 1, U cannot be regular; it will in fact have at least one invariant
simplex since N now contains fixed points. It is essential for our purposes
that U be modified in such a way that its invariant simplexes will lie in the
fixed-point set L_N (= LN). This modification can in fact be carried out.
More precisely there exist coverings U of N such that (1) the simplexes of U
are permuted among themselves by T; (2) the invariant simplexes of U form
a subcomplex U^0; (3) the simplexes of U^0, and these only, are in L_N; (4) the sim-
plexes of U − U^0 are of dimension ≤ pn + p − 1. The totality {U} of these
special coverings is a complete system.³

We now examine the homology relations in the space N and show that
every cycle in L_N B is ~ 0 in L_N. Since dim L_N ≤ n, we need only consider
cycles of dimension ≤ n. Among the special coverings, let there be chosen
U_0, ⋯, U_m defined exactly as in the proof of I(1) and let π_i, ρ_i be also
defined as before. Now let γ_{h−1} be a cycle in L_N B with h − 1 ≤ n. Since
BL_N ⊂ A_{h−1}, we have γ ~ 0 in A_h; in particular we may write

γ_{h−1}(U_h) = AX_h    (X_h ⊂ A_h).

Since the simplexes of γ(U_h), being in L_N, are in U_h^0, we have ρ_hγ(U_h) = 0.
Therefore ρ_hX_h is a cycle in A_h. Hence π_{h+1}ρ_hX_h ~ 0 in A_{h+1}. This is the first


³ The detailed construction of the U's will become clear from an examination of a
similar construction described in detail in [3], pp. 132-138.




step in the building-up process described above which leads, in Un, to the 


relations 


me 


Since Cm is of the form pmZ, it is in Un—U°, (section 2). But since all 
simplexes of 11,,— U°,, are of dimension = pn + n—1 < m, we have Cy, = 
0 ~ 0 mod and hence by lemma 1, ~ 0 mod U°. Let y’ = am: any. 
From the definition of (Cech) eycle, y’ (Un) ~y(Un) in BLy. Let 


Ph> ° Tha. Then 


pX’ = 0 mod Un; AX’ = 7’. 


The first of these relations implies the existence of U,-chains Y°, Y such that 
(1) (x°CW,,). 


Let Z° be that subchain of AY — Y’ — Y° which is in U°,, and Z the remainder. 
Then 


If we operate on both sides of (2) by p and take into account the relation (1) 
and the fact that pA° = pZ° = 0, we find that pZ =0. Hence by lemma 1 


we may write Z =p. If we insert this into (2) and then operate on both 


sides of (2) by A, we have 
= (y’ + AX° + + ApW. 


The chain in parenthesis is in U°,,, whereas ApW is in Uyp—U°». Conse- 
quently both chains are null and y’ is therefore the boundary of a chain in Wn. 
By the properties of special coverings, chains in U°,, are in Ly. Hence 


y(Un) ~ y(n) ~ 0 in Ly. Since U,, is an arbitrary special covering, we 


have {y(Un)} =y~0 in Ly. 

We now return to the space M. Let Γ be a cycle in BL. Then since
B ⊂ A_m, Γ may evidently be regarded as being identical with a cycle which
belongs to the space N. This cycle is in BL_N and is therefore homologous to
zero in L_N. It is not difficult to see, then, that Γ ~ 0 in NL, and therefore,
since N = A_m = B′, Γ ~ 0 in B′L, which completes the proof.


5. Proof of Theorem I(α) with α > 1. Assume the truth of I(β) for
β < α. Let T be a transformation of period p^α operating in M and let L′ be
the totality of points which are invariant under T^q, q = p^{α−1}. Since T^q is of
period p, L′ is non-empty by theorem I(1) and acyclic mod p by theorem II.
Moreover, L′ is transformed into itself by T. The transformation induced in
L′ by T is the identity or else is periodic and of period p^β where β < α. Hence
T admits at least one fixed point in L′.


THEOREM III. The hypothesis of finite dimensionality in theorems I(α)
and II can be omitted if M is bicompact.


The proof of theorem I(1) depends on the existence in a certain complex
U_m without fixed elements, of a sequence of cycles C_0, C_1, ⋯, terminating
in a cycle of dimension greater than dim U_m. Suppose M is bicompact but not
necessarily finite dimensional. Then, assuming that T has no fixed point,
there exists a complete family {U} of invariant coverings without fixed sim-
plexes (although the dimensions of the U's are not necessarily bounded).
Moreover there can be described a construction whereby in any given U there
can be built a sequence C_0, C_1, ⋯, terminating with any desired dimension,
in particular with a dimension greater than dim U. For, to say that M is
acyclic now means simply that every cycle in M is ~ 0. Let m = 1 + dim U
and let there be chosen refinements U = U_m, U_{m−1}, ⋯, U_0 such that
the projection into U_{i+1} of i-cycles in U_i are ~ 0. Then we construct the
desired cycles precisely as in I(1) except that now A_i = M
(i = 0, ⋯, m). Similar remarks apply to theorems II and I(α), α > 1.


THEOREM IV. Every periodic homeomorphic transformation of euclidean
3-space into itself admits at least one fixed point.


Proof. Let E_3 be converted into a 3-sphere H_3 by the addition of a single
point ∞. A periodic transformation operating in E_3 induces a periodic trans-
formation which operates in H_3 and leaves ∞ fixed. It will be sufficient to
show that the transformation of H_3 admits at least one fixed point different
from ∞. This however follows from the fact that the fixed-point set of a
periodic transformation operating in a 3-sphere, when it is not empty, consists
of two points or else is homeomorphic to a circle or a 2-sphere [4].


6. We shall say that a transformation T of period q operating in a
locally euclidean space is regular if those fixed-point sets of T, T^2, ⋯, T^{q−1}
which are not empty are locally euclidean. It can be shown for example that
if T is locally analytic, it is regular.

THEOREM V. A regular periodic homeomorphic transformation of eu-
clidean 4-space into itself admits at least one fixed point.

Proof. Let E_4 be converted into a 4-sphere H_4 by the addition of a single
point ∞. Let q be the period of a regular transformation T operating in E_4.
We may suppose that q is not a power of 2 (theorem I(α)). Then q = ps,
say, where p is an odd prime. Denoting also by T the transformation which
is induced in H_4, let L′ be the fixed point set of T^s in H_4. L′ is not empty
since it contains ∞. Since T^s is of prime period p, L′ has the same modulo p
Betti numbers as an r-sphere of 0, 1, 2 or 3 dimensions (a 0-sphere being a
pair of points) ([3], page 160). Now r can be 3 only if T^s is of period 2
([3], page 157) which is not the case. Thus L′, being locally euclidean, is
either a pair of points or a simple closed curve or a closed surface with the
mod p Betti numbers of a 2-sphere. In this last case, L′ would actually be
homeomorphic to a 2-sphere. In any case L′ is transformed into itself by T
and the transformation induced in L′ is either the identity or else is periodic
and leaves ∞ fixed. The fixed-point set in L′ is therefore a pair of points or
a simple closed curve or a set of points homeomorphic to a 2-sphere. Hence
it contains at least one point other than ∞.


REFERENCES. 


1. E. Čech, “Théorie générale de l’homologie dans un espace quelconque,” Fundamenta
Mathematicae, vol. 18 (1932), pp. 149-183.

2. P. A. Smith, “A theorem on fixed points for periodic transformations,” Annals of
Mathematics, vol. 35 (1934), pp. 572-578.

3. P. A. Smith, “Transformations of finite period,” Annals of Mathematics, vol. 39
(1938), pp. 127-163.

4. P. A. Smith, “Transformations of finite period, II,” Annals of Mathematics, vol. 40
(1939), pp. 690-711.


THE INSTITUTE FOR ADVANCED STUDY AND 
BARNARD COLLEGE, COLUMBIA UNIVERSITY. 




ON THE MATRIC EQUATION TA = BT + C.* 


By Mark H. INGRAHAM and H. C. TRIMBLE. 


The purpose of this paper is doubly two-fold. It first deals rationally
with the matrix equation

(1) TA = BT + C,

as an equation in the unknown matrix T. By finding the maximum rank of
T when C = 0, a simple treatment of the similarity problem is given. Con-
sideration of this equation for the case A = B, C = 0 leads naturally to an
isomorphism noted by P. L. Trump,¹ N. Jacobson² and others. This iso-
morphism is used to give rationally a simple treatment of certain problems
connected with the ring of matrices commutative with a given matrix. More- 
over, the work is carried through for matrices whose elements belong to a 
division algebra of finite order over its centrum so that not only are certain 
classical results for fields simplified and extended but the theory is generalized 
to this important non-commutative case. 

It has been difficult for the authors to decide on the proper mode of 
presentation of the theory. A treatment can be given that unifies both cases, 
but this is definitely less simple than that for the commutative case alone. 
Another possible method would be to present the commutative case as a unit 
first and then the non-commutative case. This results in much duplication. 
It was therefore decided to write the treatment of each section for the com- 
mutative case and then add a sub-section to point out the alterations that are 
necessary to extend the treatment to the case of elements belonging to a 
division algebra. In some sections this takes only a few sentences; in others, 
a somewhat more elaborate treatment is necessary. It is hoped that by 
omitting all sub-sections headed “non-commutative case” those interested 
chiefly in the classical case can secure a unified treatment of the theory for 
matrices with elements in a field, while those interested in the general case 
will by this treatment find a sharper analysis of the difference between the 
cases than by other methods of presentation.

It should be borne in mind that “ commutative case ” and “ non-commu- 
tative case” refer to the case where the fundamental number system from 


* Received February 13, 1939; Revised April 15, 1940.
¹ Trump [9] (p. 376).
² Jacobson [2] (p. 502).




which the elements of the matrices are chosen forms a field or a division algebra 
of finite order respectively. 

Though both authors collaborated throughout, Sections I-V are chiefly 
the work of Mr. Ingraham while the results of Section VI are, in the main, 
due to Mr. Trimble and form a portion of his doctoral dissertation. The 
authors wish to acknowledge with thanks the suggestions and help of J. H. 
Bell, C. J. Everett, Jr., and G. W. Whaples.³


I. Introductory theory. 


1. Commutative case. In a previous paper by M. H. Ingraham and M.
C. Wolf⁴ certain theory was developed which will be used throughout this paper,
part of which will be given in outline here for the convenience of the reader.
Somewhat more general results, couched in terms of ideal theory, were secured
by Jacobson.⁵ Let A be a square n × n matrix with elements in a field K,
and let ξ_1, ξ_2, ⋯, ξ_k form a set of vectors (n × 1 matrices) with elements in K.
The vectors ξ_1, ξ_2, ⋯, ξ_k are said to be linearly independent relative to A if,
whenever a set of polynomials g_i in K is such that Σ g_i(A)ξ_i = 0, it follows
that g_i(A)ξ_i = 0. The linear extension, relative to A, of a set of vectors
ξ_1, ξ_2, ⋯, ξ_k, denoted by L_A(ξ_1, ξ_2, ⋯, ξ_k), is the totality of vectors of the
form Σ g_i(A)ξ_i, where the g_i are polynomials in K. Such a set is said to be
linear relative to A and obviously is linear with respect to coefficients in K.
If ξ_1, ξ_2, ⋯, ξ_k are linearly independent relative to A, they are said to form
a proper base relative to A for their linear extension relative to A.

If ξ is any vector, there exists one and only one polynomial g of minimal
degree, with leading coefficient unity, such that g(A)ξ = 0. This polynomial
is said to be minimally associated with ξ relative to A; it divides every poly-
nomial f associated with ξ, i.e., every polynomial f such that f(A)ξ = 0.
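For matrices over the rationals, the minimally associated polynomial can be found by looking for the first linear dependence among ξ, Aξ, A²ξ, ⋯. The following is a minimal sketch of that idea, assuming exact rational arithmetic; the function names `solve_combination` and `minimal_polynomial` are illustrative and do not come from the paper.

```python
from fractions import Fraction

def mat_vec(A, v):
    """Product of the matrix A (list of rows) with the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def solve_combination(basis, target):
    """Return c with sum(c[i] * basis[i]) == target, or None if impossible.
    Assumes the vectors in `basis` are linearly independent."""
    n, k = len(target), len(basis)
    # Augmented matrix: columns are the basis vectors, last column the target.
    rows = [[basis[j][i] for j in range(k)] + [target[i]] for i in range(n)]
    pivot_row = 0
    for col in range(k):
        pivot = next((r for r in range(pivot_row, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None  # not reached when the basis is independent
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        scale = rows[pivot_row][col]
        rows[pivot_row] = [x / scale for x in rows[pivot_row]]
        for r in range(n):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # The system is inconsistent if a leftover row has a nonzero right side.
    if any(rows[r][k] != 0 for r in range(pivot_row, n)):
        return None
    return [rows[j][k] for j in range(k)]

def minimal_polynomial(A, xi):
    """Monic polynomial g of least degree with g(A) xi = 0, returned as the
    coefficient list [g_0, g_1, ..., g_k] of 1, λ, ..., λ^k (so g_k = 1)."""
    krylov = []
    current = [Fraction(x) for x in xi]
    while True:
        coeffs = solve_combination(krylov, current)
        if coeffs is not None:
            # A^k xi = sum c_i A^i xi, hence g(λ) = λ^k - sum c_i λ^i.
            return [-c for c in coeffs] + [Fraction(1)]
        krylov.append(current)
        current = mat_vec(A, current)

A = [[2, 0], [0, 3]]
g_full = minimal_polynomial(A, [1, 1])   # vector meeting both eigenspaces
g_part = minimal_polynomial(A, [1, 0])   # vector inside one eigenspace
print(g_full, g_part)
```

The loop stops after at most n steps, since n + 1 vectors in n-space are always dependent. In the example the second polynomial divides the first, in line with the divisibility property stated above.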

It can be shown that, relative to A, a proper base (ξ_1, ξ_2, ⋯, ξ_k) for the
total vector space V may be found. Let g_i be minimally associated with ξ_i
relative to A. A complete set of invariants for the system of such proper bases
is given by

THEOREM 1.⁶ If g is irreducible, the number of g_i divisible by any power
g^t of g is the same for all proper bases of V.


³ The work of Mr. Trimble and Mr. Everett has been made possible by grants from
the Research Committee of the University of Wisconsin.

⁴ Ingraham, Wolf [1].

⁵ Jacobson [2].

⁶ Ingraham, Wolf [1] (p. 20).




The ξ_i may be so chosen that the g_i are the invariant factors, the charac-
teristic divisors, or the elementary divisors of A.
We shall denote the degree of any polynomial g by <g>. The g_i of
the preceding paragraph satisfy the relation Σ <g_i> = n.


2. The non-commutative case. If the elements of A are not in a field
but are in a division algebra D of finite order over its centrum C, the theory
is somewhat more complicated and is based in part on the work of O. Ore⁷
concerning non-commutative polynomials. If g = Σ λ^i a_i is a polynomial with
coefficients in D, A a matrix, and ξ a vector with elements in D, then g(A) ⊙ ξ
is defined to be Σ A^i ξ a_i. The products g(A) ⊙ T and g_1 ⊙ g_2 are defined in a
similar manner. The polynomial g_1 ⊙ g_2 in a commutative indeterminate is the
g_2g_1, as usually defined. An immediate extension of the notions⁸ of relative
linear extensions and relative linear independence to the ⊙-process are avail-
able, as well as the idea of minimal association.

If g = g_1 ⊙ g_2, g_2 is said to be an interior factor of g. If g is any poly-
nomial, then there exists a polynomial h of minimum degree with coefficients
in the centrum and leading coefficient unity, such that h may be expressed in
the form f ⊙ g, where f is a polynomial. This polynomial h divides all other
polynomials with coefficients in the centrum having g as an interior factor.
We say that g defines h. If h is defined by g_1 but by no proper factor of g_1,
and if h is defined by g_2 but by no proper factor of g_2, then <g_1> = <g_2>,
and this number, denoted by <<h>>, is said to be the reduced degree of h.
If g is irreducible and defines h, then h is irreducible in the centrum, and
conversely, if h is irreducible in the centrum, is defined by g, and <g> =
<<h>>, then g is irreducible.

The following three theorems⁹ are restated for the convenience of the
reader.

THEOREM 2. If h is an irreducible polynomial, and if g_1 is a polynomial
of minimum degree defining h, there exists a set of transforms g_i (i =
1, 2, ⋯, t) of g_1 by the basal elements such that g_t ⊙ g_{t−1} ⊙ ⋯ ⊙ g_1
is a polynomial of minimum degree defining h^t, and hence <<h^t>> =
t<<h>>.

THEOREM 3. If U is a relative linear set and if h is minimally associated
with a vector ξ mod U over the centrum C relative to M and if g defines h,
but no interior factor of g of lower degree than the degree of g defines h, then
there exists a vector η in L_M(ξ) such that: (1) g is minimally associated with
η mod U, and (2) there exists a polynomial p such that g(M) ⊙ η = p(M) ⊙
h(M) ⊙ ξ.


⁷ Ore [5], [6].
⁸ Ingraham, Wolf [1] (p. 22).
⁹ Ingraham, Wolf [1], Theorems 25, 21, and 26.


THEOREM 4. If ξ_1, ξ_2, ⋯, ξ_k is a set of vectors linearly independent
relative to M, such that L_M(ξ_1, ξ_2, ⋯, ξ_k) is the whole space, and if g_i is the
minimum polynomial associated with ξ_i relative to M, then the rank of h(M),
where h is any polynomial over the centrum, is equal to

m − Σ_{i=1}^{k} <g_{hi}>,

where m is the order of M and g_{hi} = (h, g_i)_{ex}, the greatest common exterior
divisor of h and g_i.


From these it follows that, relative to any square matrix M, a proper base
ξ_1, ⋯, ξ_k of the total vector space, V, may be found such that ξ_i is minimally
associated with g_i^{(t_i)}, where g_i is irreducible.

It may be shown¹⁰ that this may be done in such a way that g_i^{(t_i)}(M) ⊙ ξ_i
may be written in the form p_i(M) ⊙ h_i^{t_i}(M) ⊙ ξ_i, where h_i is the polynomial
defined by g_i. Of course this may be done constructively only in algebras D
for which there exists a constructive process for factoring a polynomial into
irreducible factors.


II. The equation TA = BT + C.


1. Commutative case. D. E. Rutherford¹¹ considered this equation,
getting a complete solution, but his method involves the necessity of using the
characteristic values of A and B. This equation was also studied by R.
Weitzenböck,¹² who showed the existence of what he calls a “reine” solution.
The following gives a rational construction for the complete rational solution.

Consider the equation TA = BT + C, where A is a square n × n matrix,
B a square m × m matrix, and T and C m × n matrices.
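Since TA = BT + C is linear in the mn entries of T, one solution can also be found by ordinary elimination over the rationals. The sketch below does only that; it is not the construction by proper bases developed in this section, and the function name `solve_TA_eq_BT_plus_C` is illustrative.

```python
from fractions import Fraction

def solve_TA_eq_BT_plus_C(A, B, C):
    """Return one m x n matrix T with TA = BT + C, or None if none exists.
    A is n x n, B is m x m, C is m x n; arithmetic is exact (Fractions)."""
    n, m = len(A), len(B)
    size = m * n  # unknown T[r][s] is stored at position r * n + s
    rows = []
    for i in range(m):
        for j in range(n):
            row = [Fraction(0)] * (size + 1)
            for k in range(n):
                row[i * n + k] += A[k][j]   # T[i][k] * A[k][j] from (TA)[i][j]
            for k in range(m):
                row[k * n + j] -= B[i][k]   # B[i][k] * T[k][j] from (BT)[i][j]
            row[size] = Fraction(C[i][j])
            rows.append(row)
    # Gauss-Jordan elimination on the augmented system.
    pivots = []
    r = 0
    for col in range(size):
        pivot = next((q for q in range(r, len(rows)) if rows[q][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        scale = rows[r][col]
        rows[r] = [x / scale for x in rows[r]]
        for q in range(len(rows)):
            if q != r and rows[q][col] != 0:
                f = rows[q][col]
                rows[q] = [x - f * y for x, y in zip(rows[q], rows[r])]
        pivots.append(col)
        r += 1
    # Rows below the last pivot are zero on the left; they must be zero on the right.
    if any(row[size] != 0 for row in rows[r:]):
        return None
    flat = [Fraction(0)] * size  # free unknowns are set to zero
    for idx, col in enumerate(pivots):
        flat[col] = rows[idx][size]
    return [flat[i * n:(i + 1) * n] for i in range(m)]

A = [[1, 1], [0, 1]]
B = [[2]]
C = [[1, 0]]
T = solve_TA_eq_BT_plus_C(A, B, C)
print(T)
```

Elimination treats T as a vector of mn unknowns and so hides the structure that the proper bases expose; it is convenient mainly for checking small cases by machine.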

Let ξ_i (i = 1, ⋯, k_1) form a proper base relative to A for the total
n-space, V_1, and let g_i be minimally associated with ξ_i.

Let η_j (j = 1, ⋯, k_2) form a proper base for the total m-space, V_2,
relative to B, and let f_j be minimally associated with η_j.

The matrix T will satisfy equation (1) if and only if


(2) TAξ = (BT + C)ξ


for every ξ of a set of n linearly independent vectors of order n. We may


¹⁰ Ingraham, Wolf [1] (p. 28).
¹¹ Rutherford [7].
¹² Weitzenböck [11].




choose as such a set A^r ξ_i (i = 1, ⋯, k_1; r = 0, 1, ⋯, <g_i> − 1) since
any linear relation between these would negate the hypothesis that ξ_1, ξ_2, ⋯, ξ_{k_1}
were linearly independent relative to A.
From (1) it follows that
TA^2 = B^2 T + BC + CA,
and in general,
TA^r = B^r T + B^{r−1}C + B^{r−2}CA + ⋯ + BCA^{r−2} + CA^{r−1}
(r = 1, 2, 3, ⋯).
Call
D_C(λ^r : B, A) = B^{r−1}C + B^{r−2}CA + ⋯ + BCA^{r−2} + CA^{r−1},
and in particular
D_C(λ^1 : B, A) = C and D_C(λ^0 : B, A) = 0.
We may then write
(3) TA^r = B^r T + D_C(λ^r : B, A).
If g = Σ λ^i a_i, we define D_C(g : B, A) to be Σ D_C(λ^i : B, A)a_i. When no am-
biguity exists, we will denote this merely by D(g). From (3) it follows that
(4) Tg(A) = g(B)T + D(g).
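Relations (3) and (4) are easy to confirm numerically in the commutative case. The following sketch picks small integer matrices A, B, T at random, sets C = TA − BT so that (1) holds, and then tests both identities. The helper names and the sample matrices are illustrative assumptions, not part of the paper.

```python
def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_scale(X, c):
    return [[c * a for a in row] for row in X]

def mat_pow(X, r):
    result = [[int(i == j) for j in range(len(X))] for i in range(len(X))]
    for _ in range(r):
        result = mat_mul(result, X)
    return result

def D_power(r, A, B, C):
    """D_C(λ^r : B, A) = sum over j < r of B^(r-1-j) C A^j (zero for r = 0)."""
    total = [[0] * len(A) for _ in range(len(B))]
    for j in range(r):
        term = mat_mul(mat_mul(mat_pow(B, r - 1 - j), C), mat_pow(A, j))
        total = mat_add(total, term)
    return total

def D_poly(g, A, B, C):
    """D(g) = sum of a_i * D_C(λ^i : B, A) for g = sum a_i λ^i, scalar a_i."""
    total = [[0] * len(A) for _ in range(len(B))]
    for i, a in enumerate(g):
        total = mat_add(total, mat_scale(D_power(i, A, B, C), a))
    return total

def poly_at(g, X):
    """g(X) = sum of a_i X^i for the coefficient list g = [a_0, a_1, ...]."""
    total = [[0] * len(X) for _ in range(len(X))]
    for i, a in enumerate(g):
        total = mat_add(total, mat_scale(mat_pow(X, i), a))
    return total

A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
T = [[1, 0], [2, 3]]
C = mat_add(mat_mul(T, A), mat_scale(mat_mul(B, T), -1))  # forces TA = BT + C

# Relation (3) for r = 0, ..., 5.
power_identity_holds = all(
    mat_mul(T, mat_pow(A, r))
    == mat_add(mat_mul(mat_pow(B, r), T), D_power(r, A, B, C))
    for r in range(6)
)

# Relation (4) for g(λ) = λ^2 - 3λ + 2.
g = [2, -3, 1]
polynomial_identity_holds = (
    mat_mul(T, poly_at(g, A))
    == mat_add(mat_mul(poly_at(g, B), T), D_poly(g, A, B, C))
)
print(power_identity_holds, polynomial_identity_holds)
```

Both checks follow from the recursion D_C(λ^r) = B·D_C(λ^{r−1}) + CA^{r−1}, which is what multiplying (3) on the right by A produces.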
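Let $T\xi_i = \zeta_i$. Since $g_i(A)\xi_i = 0$, it follows from (4) that
$0 = Tg_i(A)\xi_i = g_i(B)\zeta_i + D(g_i)\xi_i$.
Hence the $\zeta_i$ satisfy the equations
(5) $g_i(B)\zeta_i = -D(g_i)\xi_i$.
Moreover, if we have a set $\zeta_i$ satisfying (5), we may determine $T$ by setting
$TA^{r}\xi_i = B^{r}\zeta_i + D(\lambda^{r})\xi_i$ $(r = 0, 1, \cdots, \langle g_i \rangle - 1)$,
so that
(6) $(TA)A^{r-1}\xi_i = (BT + C)A^{r-1}\xi_i$ $(r = 1, \cdots, \langle g_i \rangle - 1)$.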
Since equations (5) guarantee that 

we see by using equation (6) that $T$ satisfies equations (2) for the set of
vectors $A^{r}\xi_i$ $(i = 1, \cdots, k_1;\ r = 0, 1, \cdots, \langle g_i \rangle - 1)$.

The problem is therefore reduced to the determination of $\zeta_i$ satisfying
equations (5). Since the $\eta_j$ form a proper base for $V_2$ relative to $B$, we may
write

$-D(g_i)\xi_i = \sum_j q_{ij}(B)\,\eta_j$,

where the $q_{ij}$ are polynomials over $K$. Let

$\zeta_i = \sum_j x_{ij}(B)\,\eta_j$,

where the $x_{ij}$ are polynomials over $K$. It follows from equations (5) that the
$x_{ij}$ must satisfy the equations

$g_i(B)\,x_{ij}(B)\,\eta_j = q_{ij}(B)\,\eta_j$ (all $i, j$),

which is equivalent to saying

(7) $g_i x_{ij} \equiv q_{ij} \bmod f_j$,

a system of congruences whose solution is well known.
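A single congruence of type (7) can be solved by the extended Euclidean algorithm. The sketch below is one minimal way to do this over the rational field, assuming polynomials are stored as coefficient lists with the constant term first; the function names are hypothetical and the approach is a standard one, not necessarily the construction the authors have in mind.

```python
from fractions import Fraction

# Polynomials over Q as coefficient lists, constant term first; [] is zero.

def trim(p):
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def sub(p, q):
    return add(p, [-c for c in q])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, d):
    """Quotient and remainder of p by a nonzero polynomial d."""
    d = trim(d)
    rem = [Fraction(c) for c in trim(p)]
    quot = [Fraction(0)] * max(len(rem) - len(d) + 1, 0)
    while len(rem) >= len(d):
        c = rem[-1] / d[-1]
        k = len(rem) - len(d)
        quot[k] = c
        rem = sub(rem, mul([Fraction(0)] * k + [c], d))
    return trim(quot), rem

def ext_gcd(a, b):
    """Return (d, u, v) with u*a + v*b = d, where d is a gcd of a and b."""
    r0, r1 = trim(a), trim(b)
    u0, u1 = [Fraction(1)], []
    v0, v1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, sub(u0, mul(q, u1))
        v0, v1 = v1, sub(v0, mul(q, v1))
    return r0, u0, v0

def solve_congruence(g, q, f):
    """One x with g*x = q (mod f), or None when (g, f) does not divide q."""
    d, u, _ = ext_gcd(g, f)
    quotient, remainder = divmod_poly(q, d)
    if remainder:
        return None
    return divmod_poly(mul(u, quotient), f)[1]

def residue(g, x, q, f):
    # Remainder of g*x - q modulo f; it is [] exactly when x solves (7).
    return divmod_poly(sub(mul(g, x), q), f)[1]

g, f, q = [1, 1], [1, 0, 1], [0, 1]      # g = 1 + t, f = 1 + t^2, q = t
x = solve_congruence(g, q, f)
print(x, residue(g, x, q, f))
```

Any other solution differs from the one returned by a multiple of $f/(g, f)$, which agrees with the description of the solutions of (9).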

If $T_1$ satisfies equation (1) and if $T$ is any other solution, then $S = T - T_1$
satisfies the equation

(8) $SA = BS$.

For this, equations (7) reduce to

(9) $g_i x_{ij} \equiv 0 \bmod f_j$.

Hence $x_{ij} = y_{ij}\, f_j/(g_i, f_j)$, where the $y_{ij}$ may be reduced modulo the greatest
common divisors $(g_i, f_j)$ of $g_i$ and $f_j$. From this it is clear that the number
of linearly independent solutions of equation (8) is $\sum_{i,j} \langle (g_i, f_j) \rangle$, which,
since the $g_i$ and $f_j$ may be taken to be the invariant factors of $A$ and $B$
respectively, gives the well known¹³ result that the number of independent
solutions of equation (8) is the sum of the degrees of the greatest common
divisors of the invariant factors of $A$ and $B$ taken for all possible pairs. This
also determines the number of independent solutions of equation (1).
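The count of independent solutions of (8) can be checked on a small case by treating $SA = BS$ as a homogeneous linear system in the entries of $S$. The sketch below is illustrative only: the nilpotent matrices are chosen so that their invariant factors ($\lambda^2, \lambda$ for the 3 × 3 matrix and $\lambda^2$ for the 2 × 2 matrix) can be read off by inspection, and the function names are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    ncols = len(M[0]) if M else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for r in range(rk + 1, len(M)):
            factor = M[r][col] / M[rk][col]
            if factor:
                M[r] = [a - factor * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

def solution_count(A, B):
    """Dimension of the space of m x n matrices S with S A = B S."""
    n, m = len(A), len(B)
    rows = []
    # Unknown S[p][q] is column p*n + q; one equation per entry of SA - BS.
    for p in range(m):
        for q in range(n):
            row = [0] * (m * n)
            for k in range(n):          # (SA)[p][q] = sum_k S[p][k] A[k][q]
                row[p * n + k] += A[k][q]
            for k in range(m):          # (BS)[p][q] = sum_k B[p][k] S[k][q]
                row[k * n + q] -= B[p][k]
            rows.append(row)
    return m * n - rank(rows)

A3 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]   # invariant factors t^2, t
B2 = [[0, 1], [0, 0]]                    # invariant factor  t^2

# Sum of degrees of gcds over all pairs: 2 + 1 + 1 + 1 = 5 and 2 + 1 = 3.
print(solution_count(A3, A3), solution_count(A3, B2))  # prints 5 3
```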


2. Non-commutative case. This section up through equation (9) may 
be applied to the non-commutative case by merely replacing multiplication 
by the ©-process. 

We will indicate the nature of the solution of an equation of the type of
(7). This is chiefly based on the work of Ore.

Consider the congruence
(10) g © x ≡ q mod f.

Ore¹⁴ has shown that the solution of this congruence may be made to depend
on finding solutions of the two congruences

(11) g © x ≡ 0 mod f
and
g © x ≡ q₁ mod p

where f = p © (q, f) and the polynomial q₁ is defined by the method of
reduction. In other words, the solutions may be made to depend on equations
of type (11), and on equations of type (10) where q and f are relatively prime.
It is clear, moreover, that any two solutions of (10) differ by a solution of (11).

13 MacDuffee [4] (p. 90).

14 Ore [5] (p. 253). Though this work is preceded by certain postulates that limit
the application to “differential polynomials,” nevertheless, as the author points out on
p. 236, it may be applied to a non-commutative ring with a euclidean algorithm, and
hence to this case.

Let [g₁, g₂] be the least common exterior multiple of g₁ and g₂ and let d
be defined by the equation [q, f] = d © q. Ore¹⁵ proves that a necessary and
sufficient condition for the existence of a solution of equation (10), under the
condition stated above, is that there exists a polynomial f₁ relatively prime to
g such that


If it can be shown¹⁶ that the solutions of (11) are
those polynomials x which in the sense of Ore transform f into a divisor of g.
Of course the trivial solution 0 always exists. It is clear that, if h has
coefficients in the centrum and if x is a solution of (11), h © x is also a solution.
If there is a non-trivial solution, then a set of solutions x₀₁, ···, x₀ₖ may
be found which are linearly independent as to coefficients in the ring of polynomials
over the centrum and such that every solution is of the form Σ hᵢ © x₀ᵢ
where the hᵢ are polynomials over the centrum. The number k need not exceed
the order of the algebra D over its centrum C. Let x₀₁ be a non-trivial solution
of minimum degree. Let x₀₂ be a solution of minimum degree, linearly
independent of x₀₁, and in general let x₀ᵢ be a solution of minimum degree
linearly independent of x₀₁, ···, x₀,ᵢ₋₁ as to coefficients in the ring of polynomials
over the centrum. The leading coefficient of x₀ᵢ must be linearly
independent as to coefficients in the centrum of the leading coefficients of
x₀₁, ···, x₀,ᵢ₋₁; for if not there would exist polynomials of the form hⱼ = λ^{mⱼ}cⱼ
such that the cⱼ are in the centrum and x₀ᵢ − Σⱼ hⱼ x₀ⱼ (j = 1, ···, i − 1) is of lower degree than
x₀ᵢ, in contradiction to the minimal hypothesis concerning the degree of x₀ᵢ.
Hence k is not greater than the order of D over C.

It is easily seen that if the polynomial h_g, defined by g, and the polynomial
h_f, defined by f, are relatively prime, then (11) has no solution other than
x ≡ 0 mod f.

To illustrate the above let D be the system of rational quaternions. Here,
the congruence (λ + 1) © x ≡ 0 mod λ + i has only the trivial solution.
The congruence (λ + i) © x ≡ 0 mod λ + i has the two linearly independent
solutions 1 and i; the congruence (λ + 1) © x ≡ 0 mod (λ + 1)
has the four solutions 1, i, j, k; and the congruence (λ + 1) © (λ − i) © x ≡
0 mod (λ + 1) © (λ + j) has as a fundamental set of solutions j, λ + 4,
iλ − 1.

15 Ore [5] (pp. 235-236).
16 Ore [6] (p. 489, Theorem 11).


III. Characteristic divisors of a matrix for relative linear subspaces, 
and for relative linear spaces modulo a relative linear subspace. 


1. Commutative case. If $V_1$ is a linear subspace, relative to an $n \times n$
matrix $M$, of the total n-space $V$, then $M$ defines a transformation on $V_1$ to $V_1$,
and this transformation has associated with it various invariants such as
invariant factors, characteristic divisors, elementary divisors, etc. Denote
these as the invariant factors of $M$ on $V_1$, characteristic divisors of $M$ on $V_1$,
etc. If the equalities arising in such definitions are replaced by congruences
modulo a subspace $V_2$ of $V_1$ linear relative to $M$, then we may speak of the
various invariants of $M$ on $V_1$ modulo $V_2$.

The following discussion leads to an interesting treatment of the nature 


of the solution of the equation TA = BT. 


THEOREM 5. If U is a subset of V, where U and V are linear relative 
to M, then the characteristic divisors of M on U are divisors of the corre- 


sponding characteristic divisors of M on V. 
A corollary of this is 


THEOREM 6. If U is a relative linear subspace of the total vector space, 
then the characteristic divisors of M on U are divisors of the characteristic 
divisors of M. 


It is clearly sufficient to prove Theorem 5 for the case where the characteristic
divisors of $M$ are powers of one irreducible polynomial $h$. Let
these be $h^{v_1}, h^{v_2}, \cdots, h^{v_k}$, and let the characteristic divisors of $M$ on $U$ be
$h^{u_1}, h^{u_2}, \cdots, h^{u_l}$, where $v_i \ge v_{i+1}$, $u_i \ge u_{i+1}$.

Let $V_i$ and $U_i$ be the linear spaces in the total vector space $V$ and $U$
respectively which are orthogonal to $h^{i}$. $U_{i-1}$ is the intersection of $V_{i-1}$ and
$U_i$. $V_i$ contains both $U_i$ and $V_{i-1}$.

Hence order $V_i \ge$ order $U(V_{i-1}, U_i)$

= order $V_{i-1}$ + order $U_i$ − order $U_{i-1}$

and

order $V_i$ − order $V_{i-1}$ $\ge$ order $U_i$ − order $U_{i-1}$.

The left-hand side of this inequality is the number of $v_j \ge i$ and the right-hand
side the number of $u_j \ge i$. Theorem 5 follows at once.


THEOREM 7. If U is a subset of V, where U and V are linear relative
to M, then the characteristic divisors of M on V modulo U are divisors of the
characteristic divisors of M on V.

It is sufficient to prove this for the case where the characteristic divisors
are all powers of a single irreducible polynomial $h$. Let $\sigma_1, \sigma_2, \cdots, \sigma_k$ form a
proper base for $V$ relative to $M$ and be minimally associated with the characteristic
divisors $h^{s_1}, h^{s_2}, \cdots, h^{s_k}$. We proceed to form relative to $M$ a proper
base $\tau_1, \tau_2, \cdots, \tau_{k_1}$ for $V$ mod $U$. Let $\tau_1$ be $\sigma_{p_1}$, where $\sigma_{p_1}$ is minimally associated
mod $U$ with the highest power $h^{t_1}$ of $h$ which is minimally associated
mod $U$ with any of the $\sigma$. Let $\tau_{21}$ be $\sigma_{p_2}$, where $\sigma_{p_2}$ is minimally associated mod
$(U + L_M(\tau_1))$ with the highest power $h^{t_2}$ of $h$ which is minimally associated
mod $(U + L_M(\tau_1))$ with any of the $\sigma$. Choose $\tau_2$ to be a linear combination
relative to $M$ of $\tau_1$ and $\tau_{21}$, such that $\tau_{21}$ is in $L_M(\tau_1, \tau_2)$ and such that $\tau_1$ and
$\tau_2$ are linearly independent relative to $M$ mod $U$. In general, let $\tau_{i1}$ be $\sigma_{p_i}$,
where $\sigma_{p_i}$ is minimally associated mod $(U + L_M(\tau_1, \cdots, \tau_{i-1}))$ with the highest
power $h^{t_i}$ of $h$ which is minimally associated mod $(U + L_M(\tau_1, \cdots, \tau_{i-1}))$
with any of the $\sigma$, and form $\tau_i$ a linear combination relative to $M$ of $\tau_1, \cdots, \tau_{i-1}$ and
$\tau_{i1}$ such that $\tau_{i1}$ is in $L_M(\tau_1, \cdots, \tau_i)$ and $\tau_1, \cdots, \tau_i$ are linearly independent
relative to $M$ mod $U$. The existence of such $\tau_i$ was shown by Ingraham
and Wolf.¹⁷ This may be continued until a base relative to $M$ for $V$ mod $U$
is obtained. From this construction it follows that 1) $\tau_1, \cdots, \tau_{k_1}$ form a
proper base relative to $M$ for $V$ mod $U$; 2) $h^{t_1}, \cdots, h^{t_{k_1}}$ are the characteristic
divisors of $M$ mod $U$; 3) $t_i \le s_{p_i}$; 4) $t_i \ge t_{i+1}$. Hence the number
of $t_i$ greater than any integer $l$ is equal to or less than the number of $s_i$ greater
than $l$ and hence $s_i \ge t_i$ and Theorem 7 follows.

We may of course add that if $U_1$, $U_2$ are subsets of $V$, where $U_1$, $U_2$ and
$V$ are linear relative to $M$, and $U_1$ contains $U_2$, then the characteristic divisors
of $M$ on $V$ mod $U_1$ are divisors of the characteristic divisors of $M$ on $V$ mod $U_2$.


THEOREM 8. If the polynomials $h_{11}, h_{12}, \cdots, h_{1k}$ are divisors of the
characteristic divisors $h_1, h_2, \cdots, h_k$ of M on V, where V is linear relative
to M, there exist subsets $V_1$ and $U_1$ of V such that $V_1$ and $U_1$ are linear relative
to M, and such that the characteristic divisors of M on $V_1$ are equal to
the characteristic divisors of M on V mod $U_1$, and are $h_{11}, h_{12}, \cdots, h_{1k}$.


Let $(\sigma_1, \sigma_2, \cdots, \sigma_k)$ be a proper base for $V$ relative to $M$ and let $\sigma_i$
be minimally associated with $h_i$ relative to $M$. Let $h_i = h_{2i}h_{1i}$. Then
$L_M[h_{21}(M)\sigma_1, h_{22}(M)\sigma_2, \cdots, h_{2k}(M)\sigma_k]$ is effective as $V_1$, and
$L_M[h_{11}(M)\sigma_1, h_{12}(M)\sigma_2, \cdots, h_{1k}(M)\sigma_k]$ is effective as $U_1$.

17 Ingraham, Wolf [1] Part I.

Hence at once in light of Theorems 5 and 7 there is an automorphism of
the subsets linear relative to $M$ of a space $V$ linear relative to $M$ such that if
$V_1 \sim U_1$, then the characteristic divisors of $M$ on $V_1$ equal the characteristic
divisors of $M$ on $V$ mod $U_1$. This automorphism is not necessarily 1-1.


2. Non-commutative case. In the proof of Theorem 5 for the non-commutative
case the order of $V_i$ can be shown to be the number of $v_j \ge i$
times the reduced degree of $h$.

The extension of the proof of Theorem 7 can be made to depend on the 


following lemma: 


LEMMA 1. If a polynomial g minimally defines h, any factor of g
minimally defines the polynomial which it defines.


For any polynomial f defining a polynomial h, the reduced degree of f, 
<<f>>, is defined to be the reduced degree of h, <<h >>. It follows that 

Let g = g₁ © g₂ and let g₁ and g₂ define h₁ and h₂ respectively. The
polynomial g is associated with the product h₁ © h₂. Hence by Theorem 2
and the readily proved fact that if g₁ minimally defines h₁, and g₂ minimally
defines h₂ where (h₁, h₂) = 1, then g₁ © g₂ minimally defines h₁ © h₂, we have


m>>+ KG >>: 


Since << gi >> = < gi > and by hypothesis 
{Kg <g> 


it follows from (12) that < gi > =<<gi>>. Lemma 1 is an immediate 
consequence. Using the notation of Theorem 7, if 7; (j <7) is minimally 
associated with 4") over C' relative to MW and is minimally associated with g™” 
where g‘"’ is the polynomial defined in connection with Theorem 2, we let 
U,;=U+ +. Then if op, is minimally associated mod U;, 
with h“‘ over C relative to WV it is minimally associated mod Uj; with a poly- 
nomial of degree << h >> uj relative to VW. By Theorem 3 there exists in 
a vector 7; associated with g‘”’ and such that the order of Lar(7i) 


Li (op,) 
mod U, is the order of Ly(op,) mod Lu(ri) = Lu(op,) mod U,. The 
argument of the proof of Theorem 7 may be carried through with no other 


difficulty. 




IV. The rank of T where TA = BT.


1. Commutative case. In this section another approach to the solution
of the equation

(13) $TA = BT$

is given. It does not lead as immediately to the construction of $T$ as does
that of Sec. II but perhaps gives more insight into the nature of $T$. Moreover,
explicit results on the possible ranks of $T$ are given.

As in Sec. II let $V_1$ be the total n-space and $V_2$ the total m-space. Let
$V_{11}$ be the space of all vectors $\xi$ in $V_1$ such that $T\xi = 0$. $V_{11}$ is linear relative
to $A$.

Let $TV_1 = V_{21}$. Since $Tg(A)\xi = g(B)T\xi$, $V_{21}$ is linear relative to $B$.
Moreover $g(B)T\xi = 0$ if and only if $g(A)\xi \equiv 0$ mod $V_{11}$. Hence the characteristic
divisors of $A$ mod $V_{11}$ are equal to the characteristic divisors of $B$ on
$V_{21}$. Moreover, if subsets $V_{11}$ and $V_{21}$ of $V_1$ and $V_2$ respectively exist such
that $V_{11}$ is linear relative to $A$ and $V_{21}$ is linear relative to $B$, and the characteristic
divisors of $A$ mod $V_{11}$ are equal to the characteristic divisors of
$B$ on $V_{21}$, then there exists a $T$ such that $TA = BT$ where $TV_1 = V_{21}$
and $TV_{11} = 0$, for there exist vectors $\xi_1, \xi_2, \cdots, \xi_r$ linearly independent
relative to $A$ mod $V_{11}$ and associated with the characteristic divisors
$h_1, \cdots, h_r$ of $A$ mod $V_{11}$, and there exist vectors $\eta_1, \eta_2, \cdots, \eta_r$ linearly
independent relative to $B$ which form a base for $V_{21}$, and which are also associated
with the characteristic divisors of $B$ on $V_{21}$. Since $L_A(\xi_1, \xi_2, \cdots, \xi_r) + V_{11} = V_1$,
we see that $T$ is completely defined by $T\xi_i = \eta_i$, $TV_{11} = 0$ and
satisfies the stipulated conditions. The rank of $T$ is clearly the order of $V_{21}$.

By the discussion following Theorem 8, for every subset $V_{11}$ of $V_1$ linear
relative to $A$, there exists a subset $V_{12}$ linear relative to $A$, such that the
characteristic divisors of $A$ on $V_{12}$ equal the characteristic divisors of $A$ mod
$V_{11}$. Hence we conclude

THEOREM 9. The possible ranks of matrices T satisfying TA = BT are
the orders of subspaces $V_{12}$ and $V_{21}$, linear relative to A and B respectively,
such that the characteristic divisors of A on $V_{12}$ equal the characteristic divisors
of B on $V_{21}$.


Let the highest power of an irreducible polynomial $h$ appearing among
the characteristic divisors of $A$ correspond to the highest power of $h$ among
the characteristic divisors of $B$, the second highest, to the second highest, etc.
Then by Theorem 5, Theorem 9 yields




THEOREM 10. The possible rank of a matrix T satisfying TA = BT is 
the sum of the degrees of a possible set of common divisors of corresponding 


characteristic divisors (or invariant factors) of A and B. 
and hence we readily get 


THEOREM 11. The maximum rank for all matrices T satisfying TA = 
BT is the sum of the degrees of the greatest common divisors of corresponding 
characteristic divisors (or invariant factors) of A and B. 


The fundamental theorem on the equivalence of the identity of invariant
factors to similarity of matrices follows at once, since $T$ is non-singular only if
the rank of $T$ equals the order of $A$ and $B$.

Theorem 9 may be stated in the form 


THEOREM 12. The possible ranks of matrices T satisfying TA = BT are
equal to the orders of subspaces of $V_1$ and $V_2$ linear relative to A and B
respectively on which A and B are similar.


2. The Non-commutative case. This case presents no additional difficulties
not cared for by Sec. II.


V. A system isomorphic to the ring of matrices commutative with a 
given matrix. 


1. Commutative case. The following isomorphism in slightly different
form was given for a special case by Trump,¹⁸ where the $g_i$ were the elementary
divisors of $A$. Later a more general form than that required here was given
by Jacobson.¹⁹ As the derivation at this point is easy and gives explicitly
the form of the matrices involved, it will be given.


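If
$T_r A = A T_r$ $(r = 1, 2)$, then
$T_1 T_2 A = A T_1 T_2$ and $(T_1 + T_2)A = A(T_1 + T_2)$.
Let $T_r \xi_i = \sum_j x_{ij}^{(r)}(A)\,\xi_j$ $(r = 1, 2)$; then
$(T_1 + T_2)\xi_i = \sum_j \bigl(x_{ij}^{(1)}(A) + x_{ij}^{(2)}(A)\bigr)\xi_j$
and $T_1 T_2 \xi_i = \sum_{j,k} x_{ij}^{(2)}(A)\, x_{jk}^{(1)}(A)\,\xi_k$.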

Moreover, from equation (9) with $f_j$ replaced by $g_j$ we see that $x_{ij}$ must
be divisible by $g_j/(g_i, g_j)$, so that we may write $x_{ij} = y_{ij}\, g_j/(g_i, g_j)$ and the

18 Trump [9] (p. 376).
19 Jacobson [2] (p. 502).




algebra of matrices commutative with A is isomorphic to the algebra of trans- 
poses of the matrices (a;;), that is, the algebra of matrices of the form 
= = Yjiji) 
where the elements of the j-th row are reduced modulo gj. 
This matrix is particularly convenient in two cases: 1) when the $g_i$ are
the invariant factors of $A$ and 2) when the $g_i$ are the characteristic divisors
of $A$. To illustrate: In the case of three invariant factors $g_1$, $g_2$, $g_3$, $X$ is of


the form 


J2 
Y2 Yor 
Yor 
Ys1 Ys2 


where the rows are reduced modulo $g_1$, $g_2$ and $g_3$ respectively. In the second
case $X$ is the direct sum of matrices of the form given below for the case of
three characteristic divisors $g^{r_1}$, $g^{r_2}$, $g^{r_3}$, $g$ being irreducible, $r_1 \ge r_2 \ge r_3$,


Yar 
Yai Yo2 
Ys1 Ys2 Y33 


the rows being reduced modulo $g^{r_1}$, $g^{r_2}$ and $g^{r_3}$ respectively. Note that $A$ always
corresponds to $\lambda I$.


2. Non-commutative case. It can be shown²⁰ that the y's may be picked
in such a way that the x_ij of equation (9) may be taken to contain the polynomial
defined by f_j divided by the greatest common divisor of that polynomial
and the polynomial defined by g_i. Let this be v_ij. Corresponding to T there
is a matrix X of polynomials x_ij = y_ij © v_ij where the y_ij may be suitably
reduced. This form for x_ij is not sufficient to prove that (9) is satisfied. The
nature of the extra conditions can be seen from the only case we will discuss,
namely, where the gj and g; are g"? and g‘)), the g being irreducible. By a 
familiar argument all other cases reduce to this though the isomorphism thus 
produced may not always be the most convenient. We will need the following 


lemma: 


LemMA 2. If g is irreducible and defines h and if ht" © f is divisible 
by g\, then f is divisible by g™. 

20 The proof involves some complications which hardly seem worth including in
detail. Methods involve those given in Ingraham, Wolf [1] (p. 28).




Let g2 be the greatest common divisor of f and g‘) and be of lower degree 
than g‘. Then there exist polynomials p and q such that 


pOof+qOg™” 


and hence 
© (pOf+tqOg™) —pOh™Oftq Og g,. 


Each term of the left-hand side of this equation is by hypothesis divisible by 
g‘® but this is not true of the right-hand term, h*" © go, since gz defines a 
lower power of h than h*. 

Moreover, if g‘ ©f is divisible by g“, then © f is divisible by 
g*), for h' © g- © f is divisible by g™ and our statement follows from 
Lemma 2. 

In the case now under consideration vj; = where m4; = smaller 
of ri, rj. If then 

is to be divisible by gs), g © yi; must be divisible by g°"). This is also 
sufficient. 

Moreover, if the y’s satisfy the above relation it follows that 9" © yjx = 
© yjx is divisible by if + mi > 0. Hence 
© yi; © yjx is divisible by if mjx—rjy + mij > 0. 

Hence 


is divisible by g°. It readily follows that the matrices 7’ commutative with 


A are isomorphic to the transforms of matrices with elements 


where the y;; may be reduced modulo g‘”")) and where the yj; are such that 


© yi; is divisible by 


VI. Some applications of the isomorphisms to a study of the ring of 
matrices commutative with A. 


It is the wish of the authors to demonstrate the power of the foregoing 
approach as a working method. The characteristic divisor and invariant 
factor isomorphisms come naturally from a consideration of bases of relative 
linear sets and the condition on a matrix B that BA = AB. 

Using these isomorphisms the standard propositions on the nature of any 


matrix commutative with A²¹ are proved simply, and new results are obtained.


21 Wedderburn [10]. 




Not only the form but also the properties of matrices commutative with
A are treated below. Let R(A) be the ring of matrices commutative with A.


1. The determination of the ring R(A). The problem of finding the 
most general matrix commutative with A is ordinarily reduced, not in general 
rationally, to the case where A has just one characteristic value which is 
assumed without loss of generality to be zero. Then A is taken in Jordan


canonical form with blocks 


of $l_i$ $(i = 1, 2, \cdots, k)$ rows and columns along the main diagonal where the
elementary divisors of $A$ are $\lambda^{l_1}, \lambda^{l_2}, \cdots, \lambda^{l_k}$.
As an example of the simplest application of the characteristic divisor
isomorphism consider
A=/|0 1 0 0 0 


00 0 1 0 


where (6,, 6;) form a proper base relative to .1 for the space, and 6,, 8, are 


minimally associated with A*, A? respectively. 


If B is any matrix such that BA = AB, then B corresponds to 
yy + + + Did) 
( + dod, + Dood 
Then 
B8, = + + + (ard + 
Now 


It is readily checked that 


which also equals B in this case. 


Q 
01 0:--0 0 
Q Q 0 
@a 0 (yo O 
18,, A°5,, 65, = C14 Dis (yo 
(lo 0 0 oe 0 
| 




The procedure is general for any matrix $A$ in Jordan canonical form, and
with just one characteristic value which is taken to be zero.

Given a matrix $A$, it is not possible in general to find its characteristic
values rationally in the field $K$. Wedderburn²² gives a method due to Frobenius
for finding $B$ rationally, but criticizes the method because it fails to
yield explicitly the form of $B$. An explicit solution of the problem, not wholly
rational, was given by Rutherford.²³ From the present point of view the
rational solution for $B$ is essentially that given in the following example. Let
$A$ have the characteristic divisors

$(\lambda^{2} + 1)^{2}$, $\lambda^{2} + 1$.

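Let

$$A = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \end{pmatrix},$$

where $A$ is not taken in any standard canonical form since we wish to indicate
a more general procedure. To construct a proper base relative to $A$, consider
$\delta_1$ and $\delta_2$.
$A\delta_1 = \delta_6$, $A^{2}\delta_1 = -\delta_1$, and hence $(A^{2} + I)\delta_1 = 0$.
$A\delta_2 = \delta_3 - \delta_5$, $A^{2}\delta_2 = \delta_4 - \delta_2$,
$A^{3}\delta_2 = -2\delta_3 + \delta_5$,
$A^{4}\delta_2 = -2\delta_4 + \delta_2$, and hence $(A^{4} + 2A^{2} + I)\delta_2 = 0$.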


Since the linear extension relative to $A$ of $\delta_1$ consists of all vectors of the form
$a_1\delta_1 + a_2\delta_6$,
and the linear extension relative to $A$ of $\delta_2$ of all vectors of the form
$b_1\delta_2 + b_2\delta_3 + b_3\delta_4 + b_4\delta_5$,
it follows that $\delta_1$, $\delta_2$ are linearly independent relative to $A$. Hence $(\delta_2, \delta_1)$
form a proper base for the space relative to $A$, and $\delta_2$ and $\delta_1$ are minimally
associated with $(\lambda^{2} + 1)^{2}$ and $\lambda^{2} + 1$ respectively which are therefore the
characteristic divisors of $A$ as stated above.
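The computations of this example are easy to confirm mechanically. The following sketch assumes that the $\delta_i$ are the unit vectors of order 6 and that the rows of $A$ are as read from the printed matrix; the helper names `delta`, `apply`, `apply_poly` and `det` are hypothetical.

```python
# The 6 x 6 matrix A of the example, read row by row.
A = [[0, 0, 0, 0, 0, -1],
     [0, 0, 0, 0, 1, 0],
     [0, 1, 0, -1, 0, 0],
     [0, 0, 1, 0, 0, 0],
     [0, -1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0]]

def delta(i, n=6):
    """Unit vector with 1 in position i (1-indexed), assumed meaning of delta_i."""
    return [int(j == i - 1) for j in range(n)]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def apply_poly(coeffs, M, v):
    """Return (c0 I + c1 M + c2 M^2 + ...) v for coeffs = [c0, c1, c2, ...]."""
    result = [0] * len(v)
    power = list(v)
    for c in coeffs:
        result = [r + c * p for r, p in zip(result, power)]
        power = apply(M, power)
    return result

def det(M):
    """Determinant by expansion along the first row (adequate for 6 x 6)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)) if M[0][j])

d1, d2 = delta(1), delta(2)
print(apply_poly([1, 0, 1], A, d1))          # (A^2 + I) delta_1
print(apply_poly([1, 0, 2, 0, 1], A, d2))    # (A^4 + 2A^2 + I) delta_2
print(apply_poly([1, 0, 1], A, d2))          # (A^2 + I) delta_2, not zero

# Columns delta_2, A delta_2, A^2 delta_2, A^3 delta_2, delta_1, A delta_1.
cols = [d2]
for _ in range(3):
    cols.append(apply(A, cols[-1]))
cols += [d1, apply(A, d1)]
P = [[col[i] for col in cols] for i in range(6)]
print(det(P) != 0)                           # the six vectors form a base
```

The first two printed vectors are zero and the third is not, which is consistent with $\delta_1$ and $\delta_2$ being minimally associated with $\lambda^{2} + 1$ and $(\lambda^{2} + 1)^{2}$.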
If B is any matrix such that BA = AB, then B corresponds to 


22 Wedderburn [10] (p. 106). 
23 Rutherford [7]. 




+ + + (A? + 1) + 
‘Then 
BS, = + + + dy, A*)82 + (doit + 021A) 8: 
BB, (A? +1) + 124) 82+ (deal -+ 


It is readily checked that 


(Any — bs, — Dor 
RP b., — — — + — 21. + — 
dy; — My + Cn bi, — dh, — C11 0 
bo, (ley — be Doo 
where 


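$P = (\delta_2, A\delta_2, A^{2}\delta_2, A^{3}\delta_2, \delta_1, A\delta_1)$, i.e.,

$$P = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -2 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$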
Hence B is 
0 0 Dox — 
0 ly4 — C14 0 0 bi — diy 0 
— bye bi, — — Cn —butdy Ci — Aye 
Aye C11 bi, — di diy — die 
0 + di 0 0 — Cy 0 
Gar be; 0 0 — 


It would be quite possible²⁴ to follow the classical approach to a rational
solution of the problem of determining the form of any matrix commutative 
with A, to reduce the problem by assuming the characteristic divisors of A 
are all powers of the same irreducible polynomial, and to begin with A in a 
convenient canonical form. Then there could be derived explicitly the form 
of any matrix commutative with this canonical form of A, and hence of any 
matrix commutative with A itself. But the problem of finding the character- 
istic divisors of A is essentially the same as that of setting up the isomorphism 


24 Williamson [12]. 


Doo 
0 
Die 
0 
Aze 




relative to A. It should be noted further that there is no need to determine
the characteristic divisors of A, a process which requires factorization of
polynomials in the field K into their irreducible factors. The invariant factors
of A may be determined without such factorization of polynomials by the 
application of the greatest common divisor process, and then the invariant 
factor isomorphism may be used to find the most general matrix commutative 


with A. 


2. Some properties of the ring R(A). The proof of the following
theorem is of interest, and may be compared with classical proofs such as that
given by Wedderburn.²⁵

THEOREM 13. Any matrix which is commutative not only with A but
also with every matrix commutative with A is a scalar polynomial in A.


Let the matrices B in R(A) correspond to (b;;).. We seek a matrix C 
in R(A) such that for every B in R(A), BC = CB. 
If (b;;) is the Kronecker delta matrix 8;;, then it follows that 


=0 (mod (1}). 
If (bi;) is taken to be where di. =1; i< it follows that 
Cr = Citi (mod 


Hence (¢i;) = Cyl where by two matrices being congruent we shall mean 
that the corresponding elements in their i-th rows differ by multiples of h! 
and corresponds to ¢,,(A), where ¢,,(A) is a scalar polynomial. 

The theorem of Sylvester: Jf A is non-derogatory, then every matrix 
commutative with A is a polynomial in A, is an obvious consequence of the 
invariant factor isomorphism. For A, being non-derogatory, has a single 
invariant factor, and if BA = AB then B corresponds to p(A) where p is a 
scalar polynomial. Hence B= p(A). 

This theorem has an interesting generalization based upon the concept 


of minimal congruences of a matrix in the isomorphic system. 


If $(\xi_1, \xi_2, \cdots, \xi_k)$ form a proper basis for the space relative to $A$, and
if the $\xi_i$ are minimally associated with $h_i$ $(i = 1, 2, \cdots, k)$ respectively where
the $h_i$ are the invariant factors of $A$, then if $BA = AB$, suppose $G = (b_{ij})$
corresponds to $B$ under the invariant factor isomorphism. Using small letters
to denote polynomials in A 

(14) 


25 Wedderburn [10] (pp. 105-106). 




is a minimal relationship of the type to be defined. Since h,G = 0, there is a 
relation 9g,@ + a’;)l =0 such that g, is of minimal degree with leading coefti- 
cient unity. If d= (9:,h,), and mg, + nh, —d, then dG + =0: 
hence d = g, divides h,. Leth; =giq:. Then hiG + = 0, or = 
0. Hence g, divides and we may write Where dy) is reduced 


modulo q;. The minimal linear congruence in G is defined to be 


(15) + aol) =0. 
If rG + sf = 0 it is readily checked that g, divides r, and g, divides s. Hence 
the reduced congruence (15) is unique. 

Since 9,G? + gidioG =0, there is a relation goG? + + =9 
such that g. is of minimal degree with leading coefficient unity. As before g. 
divides gi. If g: then + G + qod’ool =0 and by sub- 
traction (q2t’2: — gitio) + =0. Hence g, divides — and 
gi divides Then gy divides and gs divides As before write 
= = Jollog Where doo are reduced modulo The minimal 


quadratic congruence in G@ is defined to be 
(16) 92 (G? — — 


The process may be continued to derive minimal congruences of higher
degree in $G$. If $G$ is considered as a matrix with elements in a commutative
ring, apart from reductions and other special properties of the system, $G$
satisfies its characteristic equation formed in the usual way as an identity.
Hence $G$ satisfies a congruence of degree at most its order and with leading
coefficient unity. Thus


is called the minimal congruence of G if r is minimal. If coefficients are 
reduced by means of equations of lower degree, the equation is uniquely 
determined, 

Under the isomorphism a;;(4) corresponds to a ajj(A)Z, and hence in 
the original system the equations which correspond to the minimal congruences 


set up above are 


(14’) h,(A) =0 

(16’) g2(A) [B? — a2,(A)B — = 0 

(17’) Br —ay,r-1(A) + — an (A)B— (A) = 0, 


Hence 




THEOREM 14. If A is an n-th order square matrix with elements in a 
commutative field K and with k invariant factors, and if B is any matrix 
in R(A), then A and B satisfy equations of types 14’ to 17’. 

Corollary 1. B satisfies an equation of degree at most k with leading 
coefficient unity and other coefficients polynomials in A over K. 

Since if A is non-derogatory, k = 1 and hence B= p(A) where p is a 
polynomial over K, Sylvester’s theorem is a consequence of Theorem 14. 


UNIVERSITY OF WISCONSIN. 


BIBLIOGRAPHY. 


1. M. H. Ingraham and M. C. Wolf, “Relative linear sets and similarity of matrices
whose elements belong to a division algebra,” Transactions of the American Mathematical
Society, vol. 42 (1937), pp. 16-31.

2. N. Jacobson, “Pseudo-linear transformations,” Annals of Mathematics, vol. 38
(1937), pp. 484-507.

3. N. H. McCoy, “On quasi commutative matrices,” Transactions of the American
Mathematical Society, vol. 36 (1934), 2, pp. 327-340.

4. C. C. MacDuffee, “The theory of matrices,” Ergebnisse der Math., J. Springer,
Berlin, 1933.

5. O. Ore, “Formale Theorie der linearen Differentialgleichungen II,” Journal für
Math., vol. 168 (1932), pp. 233-252.

6. O. Ore, “Theory of non-commutative polynomials,” Annals of Mathematics,
vol. 34 (1933), pp. 480-508.

7. D. E. Rutherford, “On the solution of the matrix equation AX + XB = C,”
Proc. of Sec. of Sciences, Koninklijke Akademie van Wetenschappen te Amsterdam, vol.
35 (1932), 1, pp. 54-59.

8. D. E. Rutherford, “On the rational commutant of a square matrix,” Proc.
Amsterdam, vol. 35, pp. 870-875.

9. P. L. Trump, “On a reduction of a matrix by the group of matrices commutative
with a given matrix,” Bulletin of the American Mathematical Society, vol. 41 (1935),
pp. 374-380.

10. J. H. M. Wedderburn, “Lectures on matrices,” American Mathematical Society
Colloquium Publications, vol. 17, pp. 102-114.

11. R. Weitzenböck, “Über die Matrixgleichung AX + XB = C,” Proc. of Sec. of
Sciences, Koninklijke Akademie van Wetenschappen te Amsterdam, vol. 35 (1932), 1,
pp. 60-61.

12. J. Williamson, “Idempotent and nilpotent elements of a matrix,” American
Journal of Mathematics, vol. 58 (1936), pp. 747-758.


ON THE DEGREE OF CONVERGENCE OF THE DERIVED SERIES 
OF BIRKHOFF.*¹


By W. H. McEwen. 


1. Introduction. Let $f(x)$ be a given function and let $S_N(x)$ $(N = 1, 2, \cdots)$
represent the N-th order partial sums of its Birkhoff series, defined
with respect to a given n-th order linear homogeneous differential system on
an interval $0 \le x \le 1$ (or more generally $a \le x \le b$). For such sums Milne²
has shown that if f(a) is of limited variation on (0, 1), where m is an 
arbitrary positive integer, and if f, f’,- + +, f("-) vanish at 0 and 1, then 

f(x) —Sx(x) = O(1/N™) 
uniformly on $0 \le x \le 1$. The purpose of the present paper is to show (i)
that Milne’s result can be extended to the derivatives of $S_N(x)$, so that under
the same hypotheses
(x) Sy ) ( x) 0 ( 


uniformly on $0 \le x \le 1$, for $k = 0, 1, \cdots, m - 1$; (ii) that corresponding
results for an interior interval $0 < \delta \le x \le 1 - \delta$ may be obtained without
assuming that $f, f', \cdots, f^{(m-1)}$ vanish at 0 and 1 if a suitable method of
summation is used.

2. The sums $S_N^{(k)}(x)$. We begin with a brief description of the Birkhoff
series. For more detailed information the reader is referred to one of
the well known papers on the subject.³
Let the given n-th order differential system be 


(1) L(u) + + +++ ++ = 0, 
Wj(u) =0 (j= 1,2,:--,n), 


in which the functions $P_2, \cdots, P_n$ are continuous and have continuous derivatives
of all orders on (0, 1), and the boundary conditions (consisting of n linearly
independent linear homogeneous forms in $u^{(j)}(0)$, $u^{(j)}(1)$, $j = 0, 1, \cdots, n - 1$)
are normalized and regular.⁴ Let $G(x, y; \lambda)$ be the Green’s function

* Received January 6, 1940.

1 Presented by title to the American Mathematical Society in September, 1938.

2 W. E. Milne, Transactions of the American Mathematical Society, vol. 19 (1918),
pp. 143-156.

3 See G. D. Birkhoff, Transactions of the American Mathematical Society, vol. 9
(1908), pp. 373-395; J. Tamarkin, Rendiconti del Circolo Matematico di Palermo, vol.
34 (1912), pp. 345-382; M. H. Stone, Transactions of the American Mathematical
Society, vol. 28 (1926), pp. 695-761. The paper by Stone is the one that the author
has followed particularly.

4 For definition of these terms see Birkhoff, loc. cit., p. 382.


29 


30 W. H. MCEWEN. 


of the system. The poles of $G$ are then the characteristic values of (1), and these form an infinite sequence of complex numbers $\{\lambda_i\}$ which may be arranged so that $|\lambda_1| \le |\lambda_2| \le \cdots$, $\lim |\lambda_i| = \infty$. Let $\{R_i(x, y)\}$ be the corresponding sequence of residues of $G$. The Birkhoff series for $f(x)$ may then be written

$$\sum_{i=1}^{\infty} \int_0^1 f(y)\, R_i(x, y)\, dy.$$

If the poles are all simple⁵ this has the form

$$\sum_{i=1}^{\infty} u_i(x) \int_0^1 f(y)\, v_i(y)\, dy,$$

where $\{u_i(x)\}$ and $\{v_i(x)\}$ are the sequences of characteristic solutions of system (1) and its adjoint respectively.

Let $C_1, C_2, \cdots$ be a system of concentric circles in the complex $\lambda$-plane with centres at $\lambda = 0$ and radii $\Lambda_\nu$, where $\Lambda_1 < \Lambda_2 < \cdots$, $\lim_{\nu \to \infty} \Lambda_\nu = \infty$. Let these circles be so drawn as to remain uniformly away from the poles of $G$⁶ and such that between every two consecutive ones there is at least one pole of $G$ and as few others as possible. Let $N$ denote the number of poles (each counted according to its multiplicity) enclosed in any specified circle $C_\nu$. Then the partial sums of the Birkhoff series may be written

$$(2)\qquad S_N(x) = \frac{1}{2\pi i} \int_{C_\nu} \int_0^1 f(y)\, G(x, y; \lambda)\, dy\, d\lambda, \qquad i = \sqrt{-1}.$$


A more useful form of (2) is obtained by placing $\lambda = \rho^n$. The entire $\lambda$-plane is thus made to correspond to a sector $\Sigma$ in the $\rho$-plane, composed of two adjacent sectors of the following set of $2n$ equal sectors:

$$S_l:\quad l\pi/n \le \arg \rho \le (l+1)\pi/n \qquad (l = 0, 1, \cdots, 2n-1).$$

The circles $C_\nu$ are then transformed into arcs of circles $|\rho| = R = (\Lambda_\nu)^{1/n}$ lying in the sector $\Sigma$. Let $\Gamma$ denote these arcs. Then we can write

$$S_N(x) = \frac{1}{2\pi i} \int_\Gamma \int_0^1 f(y)\, n\rho^{n-1}\, G(x, y; \rho^n)\, dy\, d\rho,$$

and the derivatives of $S_N$ of arbitrary order $k$ may be written

$$S_N^{(k)}(x) = \frac{1}{2\pi i} \int_\Gamma \int_0^1 f(y)\, n\rho^{n-1}\, \frac{\partial^k G}{\partial x^k}\, dy\, d\rho.$$

In connection with the circles $|\rho| = R$ it is to be noted that $R = O(N)$.


5 The poles are in general simple when $|\lambda|$ is large, being always so if the system is of odd order, or is of even order and self-adjoint. Multiple poles if they occur are double.

6 This is always possible.

7 See Stone, loc. cit., p. 741. The notation {A; B} is used to indicate that A is to be taken when $x > y$ and B when $x < y$.


DEGREE OF CONVERGENCE OF DERIVED BIRKHOFF SERIES. 31 


3. The expressions $n\rho^{n-1}\,\partial^k G/\partial x^k$. Asymptotic formulas for the derivatives of $G$ of arbitrary order $k$ are given in Stone's paper⁸ (which paper will be referred to hereafter as (S)). In dealing with these formulas it is necessary to consider separately the two cases when the system (1) is of odd or of even order. For the sake of brevity we shall deal only with the case of odd order

$$n = 2\mu - 1,$$

and merely remark that the treatment of the case of even order is entirely analogous.

The formulas involve in an important way the $n$-th roots of $-1$. Let these be denoted by $\omega_1, \cdots, \omega_n$, and suppose for each sector $S$ the subscripts are so distributed that for values of $\rho$ in that sector

$$\mathrm{Re}(\rho\omega_1) \le \mathrm{Re}(\rho\omega_2) \le \cdots \le \mathrm{Re}(\rho\omega_n), \qquad \mathrm{Re} = \text{"the real part of."}$$

Consider one of the sectors $S$ which make up $\Sigma$, and let $\gamma$ denote the half of the arc $\Gamma$ lying in it. Then when $\rho$ is on $\gamma$, $\mathrm{Re}(\rho\omega_i)$ is negative if $i < \mu$ and positive if $i > \mu$, whereas $\mathrm{Re}(\rho\omega_i)$ changes sign if $i = \mu$, being negative on one half of $\gamma$ and positive on the other, vanishing at the mid-point. Let $\gamma_1$ and $\gamma_2$ denote the two halves of $\gamma$ on which $\mathrm{Re}(\rho\omega_\mu)$ is negative and positive respectively.

From (S, p. 745) with the help of (S, Theorem III', p. 706) we obtain the following formula which holds when $\rho$ is on $\gamma_1$:


(AG ( : Mus (2, y) E(x, y, p) 
n Mae (2, 4 (a, y, 
wjerei(@ (pw; )* ke ( + ( p) \ 
(poi) p J 


A, 
[Ao] + 0, G 2 


As stated in footnote 7 the notation {A; B} is used to indicate that A is to be taken when $x > y$, and B when $x < y$. The integer $m$ is arbitrary, and in section 5 will be identified with the $m$ of the theorem there stated. The functions $M_{is}$ may be taken as continuous and differentiable to all orders on $0 \le x \le 1$, $0 \le y \le 1$, and moreover may be chosen independent of the particular sector $S$. $E$ is used here, and elsewhere in this paper, to indicate any function which is uniformly bounded as $\rho \to \infty$. The particular nature of the expressions in the last term will be made clear later on in section 5. For convenience in writing let


8 Stone, op. cit.




n 
i= 


{9xs 3 Jus } = {— > (2-y) (poi )* Mis ; (poi )* Mis}, 
i=ut+1 


Then, when $\rho$ is on $\gamma_1$,


( \ m A, ( 1 ) 
NE Oak dak = Jxs} + [Oo] | + p™ -k+1 


When $\rho$ is on $\gamma_2$ the formula is similar except for a change in the ranges of the summations involved, which now must be $(1, \mu-1)$ and $(\mu, n)$. The corresponding formulas for the other sector $S$ of $\Sigma$ are exactly the same (with the same $M_{is}$'s), but with a new distribution of the subscripts of the $\omega$'s.


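4. Preliminary lemma. The following result, which was used by Milne, will be required in the next section.

LEMMA. If $F(y)$ is of limited variation on $(a, b)$ and $c$ is any constant $\ne 0$, then in the half-plane where $\mathrm{Re}(c\rho) \le 0$,

$$\int_a^b e^{c\rho y} F(y)\, dy = O(1/\rho).$$

It may be proved as follows:

$$\left| \int_a^b e^{c\rho y} F(y)\, dy \right| = \frac{1}{|c\rho|} \left| e^{c\rho b} F(b) - e^{c\rho a} F(a) - \int_a^b e^{c\rho y}\, dF(y) \right| \le \frac{1}{|c\rho|} \left[ |F(b)| + |F(a)| + \int_a^b |dF(y)| \right] = \frac{1}{|c\rho|} \left[ |F(b)| + |F(a)| + V(b) \right],$$

where $V(y)$ is the total variation of $F(y)$ in $(a, y)$.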
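The estimate in the lemma can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the interval $(0, 1)$, the sample function $F(y) = y^2$ (increasing, so its total variation on $(0,1)$ is 1) and $c = 1$; the function names and the midpoint-rule quadrature are illustrative choices only. For these data the lemma's bound reads $|\rho| \cdot |\int_0^1 e^{\rho y} F(y)\,dy| \le |F(1)| + |F(0)| + V(1) = 2$.

```python
import cmath


def oscillatory_integral(F, rho, c=1.0, n=20000):
    """Midpoint-rule approximation of the integral of exp(c*rho*y) * F(y) over [0, 1]."""
    h = 1.0 / n
    total = 0j
    for k in range(n):
        y = (k + 0.5) * h
        total += cmath.exp(c * rho * y) * F(y)
    return total * h


def lemma_bound(F0, F1, variation, rho, c=1.0):
    """Right-hand side (|F(1)| + |F(0)| + V) / |c*rho| of the integration-by-parts estimate."""
    return (abs(F1) + abs(F0) + variation) / abs(c * rho)


def F(y):
    # Sample function (an assumption of this sketch): increasing on [0, 1], total variation 1.
    return y * y


def scaled_integrals(radii=(10.0, 50.0, 100.0)):
    """Return |rho| * |integral| for sample rho with Re(rho) <= 0 (up to rounding)."""
    values = []
    for R in radii:
        for theta in (0.5 * cmath.pi, 0.75 * cmath.pi, cmath.pi):
            rho = cmath.rect(R, theta)
            values.append(abs(rho) * abs(oscillatory_integral(F, rho)))
    return values


if __name__ == "__main__":
    for value in scaled_integrals():
        print(round(value, 4))
```

Each printed value should stay below 2, in agreement with the bound; the check is only as accurate as the quadrature, which is why moderate values of $|\rho|$ are used.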


5. Degree of convergence of $S_N^{(k)}(x)$. In this section we prove the first main result of the paper.

THEOREM I. Let $m$ be an arbitrary positive integer, and suppose $f^{(m)}(x)$ is of limited variation on $(0, 1)$ and $f, f', \cdots, f^{(m-1)}$ vanish at 0 and 1. Then

$$f^{(k)}(x) - S_N^{(k)}(x) = O(1/N^{m-k})$$

uniformly on $0 \le x \le 1$, for $k = 0, 1, \cdots, m-1$.


Consider first the integral 




1 
0 0 
1 
+ f. F(y) dy. 


i=y+1 
On integrating by parts $m$ times, and using the hypothesis that $f, f', \cdots, f^{(m-1)}$ vanish at 0 and 1, we obtain


al 
7 0 


i= 


n 1 
+ 0; (pwi )A m F(m) (y) dy 


But, by hypothesis, $f^{(m)}$ is of limited variation on $(0, 1)$. Hence, by an application of the lemma,


1 1 
f(y) Jxo}dy = O 
Jo p n-k 


i=1 


Next, consider the inte gral ff Let functions ves (2, y) 
be defined as follows: 
Wis (x,y) =f (y) Mis (2, y), (s==1,2,---,m). 
Then, under the hypothesis of the theorem, it is obvious that yrs, Wnes* * * 
Wes’) vanish at y= 0 and y = 1, where the differentiations are in each case 
with respect to the second argument, the variable y. Also, Yrs”, Wns”, ete. 
are of limited variation on (0,1). Hence, on integrating by parts m — 1 times 


and applying the lemma, we obtain 


f(y) Gir = -0(- ~ =) 


n 
+ Yur ©) (puoi )*? (x; v) wi (pw; 
i=1 


Proceeding in this manner, integrating by parts $m - 2$ times in the case of $g_{k2}$, $m - 3$ times in the case of $g_{k3}$, and so on, and then combining the various results, we find that


3) f(y) ) Jes}dy = O =) + f(x) 


+ (f’(@) + (a, > wi (poi )*? 
n 
4 (f(™-)) (a) War "-?) (2, xr) 2) ) (pwi)*". 


n n 
3 




Next, consider the integral 


A description of the determinant $\Delta_k$ is given in (S, p. 745 and p. 717). For our purpose it will be sufficient to observe that the variable $y$ occurs only in the elements of the last column of $\Delta_k$, and all the elements in the corresponding co-factors are uniformly bounded as $\rho \to \infty$, except for those in the first row of $\Delta_k$ which are all $O(\rho^k)$. Let $0, Y_1, \cdots, Y_n$ denote the elements of the last column, and let $X_0, X_1, X_2, \cdots, X_n$ be the corresponding co-factors. Then

$$\int_0^1 \Delta_k\, f(y)\, dy = \sum_{j=1}^{n} X_j \int_0^1 Y_j\, f(y)\, dy,$$


where (from S, p. 717 with the help of Theorem III’, p. 706) 


(4) Y;=0 ( n =) +( 
i=1 i=p+1 
\ i=l p® i=p+1 p 


Here the functions $N_{is}$ have the same general properties as the functions $M_{is}$, and the numbers $\alpha_j$, $\beta_j$, $k_j$ are constants in the boundary conditions of the differential system.

On multiplying (4) by f(y)dy and integrating from 0 to 1, integrating 
by parts m times in the case of the first expression in parenthesis, m — 1 times 
in the case of the term s = 1, and so on, we find, after applying the lemma, 


that 
1 1 
0 p 


Hence, since $X_j = O(\rho^k)$, and since the expression $[\theta_0] + e^{\rho\omega_\mu}[\theta_1]$ is known to be uniformly bounded away from zero as $\rho \to \infty$,⁹ we have
k 


On combining (5) and (3), we obtain 


(f(z) + (2, + + Yum (2, > wi (poi), 


9 See Birkhoff, op. cit.


Although this result was obtained for the case when $\rho$ is on $\gamma_1$, it holds equally when $\rho$ is on $\gamma_2$, and since the $M_{is}$'s and therefore also the $\psi_{is}$'s are independent of the sector $S$ it holds generally when $\rho$ is on $\Gamma$.

Finally, let us multiply (6) by $(1/(2\pi i))\,d\rho$ and integrate over the arc $\Gamma$ on which $|\rho| = R$. In this connection we observe that if $s$ is any integer, positive, negative or zero, the integral

$$\int_\Gamma \sum_{i=1}^{n} (\rho\omega_i)^s\, d(\rho\omega_i)$$

is zero in every case except $s = -1$, when it has the value $2\pi i$. For, on setting $z = \rho\omega_i$ the integral may be expressed in the form

$$\int z^s\, dz,$$

where the path of integration is now the entire circumference of the circle $|z| = R$. Hence we can write


HG FG ) 


But the functions (7,7) + x)) are independent of 
p (and therefore of N), and hence must be identically zero if the sums Sy (z) 
are to converge to f(a) at all. Moreover, it will be observed that these 
functions are linear combinations of the functions f(x), f’(r),: -, f(z), 
with coefficients depending only on the functions (2, 2). 
These coefficients, therefore, must be identically zero if there is to be conver- 
gence at all. But the fact of convergence has been established by the author,’° 
under hypotheses on f(x) which are substantially more general than those 
admitted in the theorem of this section. Hence we conclude that the coeffi- 
cients in question are identically zero. This property of vanishing must be 
regarded as something inherent in the Green’s function of the problem, and 
not in any way dependent on the functions f(z). The author has verified this 
fact by direct calculation in the case k= 1. Thus, finally, since R= O(N), 


we obtain 


10 W. H. McEwen, Bulletin of the American Mathematical Society, August, 1939, pp. 576-582; Theorem 2, p. 582. Here it is shown that if $f(x)$ satisfies the boundary conditions $W_j(f) = 0$, $W_j(L(f)) = 0, \cdots, W_j(L^{p-1}(f)) = 0$; and if $L^p(f)$ is integrable, $p$ being an arbitrary positive integer, then $\lim S_N^{(k)} = f^{(k)}$ uniformly on $(0, 1)$, for $k = 0, 1, \cdots, pn-1$. With $p$ suitably chosen there will exist a class of arbitrary functions $f(x)$ satisfying the hypotheses of both this theorem and the one in the text. Hence for all such functions convergence is assured.


$$S_N^{(k)}(x) = f^{(k)}(x) + O(1/N^{m-k})$$

uniformly on $0 \le x \le 1$, for $k = 0, 1, \cdots, m-1$, which proves the theorem.


6. A summation method. Stone¹¹ has indicated a method of summing the derived series of Birkhoff (essentially the method of Riesz typical means), in which the $N$-th order sums take the form


N dit 1 gk 
(1-45) f, f(y) y) dy, l= 0, k = 0, 1, 


i=1 
We shall adapt this to our present needs (with slight modifications) to read 


as follows: 
(8) (1-45) f, f(y) age (x, y) dy 


where, for reasons which will appear presently, $\alpha$ is assumed to be any positive integer such that $4\alpha n \ge m$. Denoting by $\sigma_N(x)$ the sums (8) corresponding to the case $l = 0$, we are thus led to consider the sums $\sigma_N^{(k)}(x)$ for $k = 0, 1, \cdots, m-1$, which may be represented by the contour integral formula


oy) (2) = ) f np f(y)dydp, i= V—1. 


Rtan 


7. Degree of convergence of $\sigma_N^{(k)}(x)$. In this section we discard from the hypothesis the requirement that $f, f', \cdots, f^{(m-1)}$ vanish at 0 and 1, and prove the following result.

THEOREM II. If $f^{(m)}(x)$ is of limited variation on $(0, 1)$, then

$$\sigma_N^{(k)}(x) = f^{(k)}(x) + O(1/N^{m-k})$$

uniformly on $0 < \delta \le x \le 1 - \delta$, for $k = 0, 1, \cdots, m-1$.
Under the hypothesis of this theorem the asymptotic formula (6) will again represent

$$\int_0^1 n\rho^{n-1}\, \frac{\partial^k G}{\partial x^k}\, f(y)\, dy,$$

provided certain additional terms are now added to it. These terms arise from the various operations of integration by parts because of the failure of $f, f', \cdots, f^{(m-1)}$ to vanish at 0 and 1. Let $\Phi(x, \rho)$ denote these terms. Then, when $\rho$ is on $\gamma_1$, we have explicitly

11 Stone, op. cit.


i=pr+1 


(f’(9) +- (4, 0)) p> (pw; 
— + Yar (2 1)) Doi 
i=y+1 


(fim) (0) (2, 0) 4+. ‘ + vem (2, 0)) (pox) 


i=u+1 


On multiplying $\Phi(x, \rho)$ by $(1/(2\pi i))(1 - \rho^{4\alpha n}/R^{4\alpha n})^m\, d\rho$ and integrating over $\gamma_1$, we obtain the following set of integrals:


"1 
(2-1) 
f, l — Pian dp, (t=p+ 
7 V1 


where $j = 1, 2, \cdots, m$. But with $x$ restricted to $0 < \delta \le x \le 1 - \delta$, each of these integrals for which $i \ne \mu$ converges to zero to an infinite order in $1/R$ as $R \to \infty$ (since in each case the real part of the exponential factor is $< 0$). On the other hand, when $i = \mu$, we can easily show, following the method of (S, p. 715, in the proof of Lemma V), that when $x$ is thus restricted


4an K 
J (: | = = +j = 0 (ts) 


The proof is as follows. Let $\phi$ be an angle measured from the bisecting ray of $S$ (positively in the direction of $\gamma_1$), and set $\rho\omega_\mu = iRe^{i\phi}$. Then, as $\rho$ varies over $\gamma_1$, $\phi$ will vary over $0 \le \phi \le \pi/(2n)$, and moreover $\phi$ will satisfy $0 \le \phi/2 \le \sin\phi$. Hence, since $|1 - \rho^{4\alpha n}/R^{4\alpha n}|^m \le C\phi^m$, we have


m 1/(2n) C Rr/(2n) 
"1 0 0 


& K 


Similar results hold on the other parts of $\Gamma$. Hence


1 pton m 1 
(10) on (z, p) dp = <— O 


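Next, consider the result of multiplying the right hand side of formula (6) by $(1/(2\pi i))(1 - \rho^{4\alpha n}/R^{4\alpha n})^m\, d\rho$ and integrating over $\Gamma$. The integrals which now appear have the form

$$\int_\Gamma \sum_{i=1}^{n} (\rho\omega_i)^s \left(1 - \rho^{4\alpha n}/R^{4\alpha n}\right)^m d(\rho\omega_i) = \int z^s \left(1 - z^{4\alpha n}/R^{4\alpha n}\right)^m dz, \qquad s = k-1, \cdots, k-m,$$

the latter integration being taken over the entire circumference of the circle $|z| = R$. But

$$\int z^s \left(1 - \frac{z^{4\alpha n}}{R^{4\alpha n}}\right)^m dz = \int z^s\, dz - \frac{m}{R^{4\alpha n}} \int z^{s + 4\alpha n}\, dz + \cdots + \frac{(-1)^m}{R^{4\alpha n m}} \int z^{s + 4\alpha n m}\, dz,$$

and the integrals on the right being all either zero or $2\pi i$, it follows that the right hand side may be written

$$\int z^s\, dz$$

since, by hypothesis, $4\alpha n \ge m$. Furthermore, $\int z^s\, dz = 0$ except when $s = -1$, when it has the value $2\pi i$. Hence, by the argument of section 5,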


and by virtue of (10), we conclude finally that 


$$\sigma_N^{(k)}(x) = f^{(k)}(x) + O(1/N^{m-k})$$

uniformly on $0 < \delta \le x \le 1 - \delta$, for $k = 0, 1, \cdots, m-1$.
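The two contour-integral facts used in sections 5 and 7 can be verified numerically. The sketch below is illustrative only: the function names, the radius $R = 2$ and the sample values $m = 3$, $\alpha = 1$, $n = 3$ (so that $4\alpha n = 12 \ge m$) are assumptions made for the check, not quantities taken from the paper. It uses the trapezoid rule on the circle $|z| = R$, which integrates trigonometric polynomials of low degree exactly up to rounding.

```python
import cmath
import math


def circle_integral(g, R, n=4096):
    """Approximate the integral of g(z) dz over |z| = R by the periodic trapezoid rule."""
    h = 2.0 * math.pi / n
    total = 0j
    for j in range(n):
        z = cmath.rect(R, j * h)
        total += g(z) * (1j * z) * h  # dz = i z d(theta)
    return total


def power_integral(s, R):
    """Integral of z**s over |z| = R; expected to be 2*pi*i for s = -1 and 0 otherwise."""
    return circle_integral(lambda z: z ** s, R)


def weighted_power_integral(s, R, m, alpha, n_order):
    """Integral of z**s * (1 - z**(4*alpha*n)/R**(4*alpha*n))**m over |z| = R."""
    e = 4 * alpha * n_order
    return circle_integral(lambda z: z ** s * (1 - (z / R) ** e) ** m, R)


if __name__ == "__main__":
    R, m, alpha, n_order = 2.0, 3, 1, 3
    for s in range(-m, m):  # covers every s = k-1, ..., k-m with 0 <= k <= m-1
        plain = power_integral(s, R)
        weighted = weighted_power_integral(s, R, m, alpha, n_order)
        print(s, abs(plain), abs(weighted - plain))
```

For every exponent in the range, the weighted and unweighted integrals agree, because each extra term in the binomial expansion carries a power $z^{s + 4\alpha n j}$ with $j \ge 1$, whose exponent is non-negative when $4\alpha n \ge m$.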


Mount ALLISON UNIVERSITY, 
SACKVILLE, N. B., CANADA. 


ORDERED TOPOLOGICAL SPACES.* 


By SAMUEL EILENBERG. 


1. A topological space¹ $X$ will be called ordered if a relation $<$ is given satisfying the following conditions

(1.1) For any $x, y \in X$ one and only one of the relations $x < y$, $x = y$, $y < x$ holds,

(1.2) If $x < y$ and $y < z$, then $x < z$,

(1.3) If $x, y \in X$ and $x < y$ then there are neighborhoods $U(x)$ of $x$ and $U(y)$ of $y$ such that $x' < y$ and $x < y'$ whenever $x' \in U(x)$ and $y' \in U(y)$.

Given $x \in X$ denote by $A_x$ the set of points $y \in X$ such that $y < x$ and by $B_x$ the set of points $y \in X$ such that $x < y$. Conditions (1.1) and (1.3) are then equivalent to the following

(1.1') $X - x = A_x + B_x$, $A_x B_x = 0$,

(1.3') $A_x$ and $B_x$ are open.

Although we start with a general topological space we have that

(1.4) Every ordered space $X$ is a Hausdorff space.²

Proof. Let $x < y$. If there is an element $z$ such that $x < z < y$ then $x \in A_z$ and $y \in B_z$, the sets $A_z$ and $B_z$ being open and disjoint because of (1.1') and (1.3').

If no such $z$ exists then $A_y B_x = 0$, $x \in A_y$, $y \in B_x$ and $A_y$ and $B_x$ are open because of (1.3').
2. Things become simpler if $X$ is supposed connected.

(2.1) If $X$ is connected then (1.2) is a consequence of (1.1) and (1.3).

Proof. Let $x < y$ and $y < z$. Since $z \in B_y$, therefore $z$ does not belong to $A_y + y$, and by (1.1')

$$A_y + y \subset X - z = A_z + B_z.$$

$X$ being connected it follows from (1.1') and (1.3') that $A_y + y$ is connected.


* Received March 4, 1940.
1 A topological space is a neighborhood space satisfying the first three axioms of F. Hausdorff, Grundzüge der Mengenlehre, Leipzig, 1914, p. 213.
2 Ibid., p. 213.


Since by (1.1') and (1.3'), $A_z$ and $B_z$ are open and disjoint and since $y \in A_z$, therefore

$$A_y + y \subset A_z.$$

Since $x \in A_y$ it follows that $x \in A_z$ and therefore $x < z$.
(2.2) If $X$ is ordered and connected then the order-type of $X$ is continuous.

Proof. Let $X = P + Q$, $P \ne 0 \ne Q$ be a decomposition such that $x < y$ whenever $x \in P$, $y \in Q$.

Suppose that $P$ has no last element and that $Q$ has no first element. Therefore given $x \in P$ there is a point $x_1 \in P$ such that $x < x_1$. By (1.3) there is a neighborhood $U(x)$ of $x$ such that $x' < x_1$ whenever $x' \in U(x)$. Therefore $U(x) \subset P$ and $P$ is open. Similarly we prove that $Q$ is open. This contradicts the connectedness of $X$ since $PQ = 0$.

Suppose now that $x$ is the last element of $P$ and that $y$ is the first element of $Q$, then $P = A_y$ and $Q = B_x$. By (1.3') $P$ and $Q$ are open which leads to a contradiction again.


3. In this section we shall be concerned with the possibility of establishing an order in a given topological connected space $X$.

Let $X \times X$ be the product space consisting of all couples $(x, y)$ where $x, y \in X$.³ Let $P(X)$ be the subset of $X \times X$ determined by the condition $x \ne y$.

THEOREM I. A topological connected⁵ space $X$ can be ordered if and only if $P(X)$ is not connected.

Proof. Suppose that $X$ is ordered. Let $A(X)$ be the subset of $P(X)$ consisting of all points $(x, y)$ such that $x < y$. Similarly we define $B(X)$ by the condition $y < x$. From (1.1) it follows that

$$P(X) = A(X) + B(X), \qquad A(X)B(X) = 0.$$

From (1.3) it follows that $A(X)$ and $B(X)$ are open. Therefore $P(X)$ is not connected.

Given $(x, y) \in P(X)$ let

$$\Lambda(x, y) = (y, x).$$

Clearly $\Lambda$ is a homeomorphic transformation of $P(X)$ on itself.

In order to prove the second part of Th. I let us suppose first that a


3 Ibid., p. 90.
4 Note that $(x, y) \ne (y, x)$ unless $x = y$.
5 The case when $X$ contains no more than one point is excluded from our considerations.


decomposition into two open disjoint sets $P(X) = A + B$ is given such that $\Lambda(A) = B$. Define the relation $<$ by writing $x < y$ if and only if $(x, y) \in A$. Clearly (1.1) and (1.3) are satisfied. Since $X$ is connected it follows from (2.1) that (1.2) is satisfied too and therefore $X$ is ordered by the relation $<$. Hence the proof of Th. I reduces to the following

(3.1) If $X$ is connected and $P(X)$ is not connected, then $P(X)$ consists of two components $A$ and $B$ such that $\Lambda(A) = B$.

Proof. Assume on the contrary that

$$P(X) = C_1 + C_2, \qquad C_1\Lambda(C_1) \ne 0.$$

Let $D_1 = C_1\Lambda(C_1)$ and $D_2 = C_2 + \Lambda(C_2)$. We then have

(*) $P(X) = D_1 + D_2$, $D_1 \ne 0 \ne D_2$, $D_1\bar{D}_2 = 0 = \bar{D}_1 D_2$,

(**) $\Lambda(D_1) = D_1$, $\Lambda(D_2) = D_2$.

Given $x \in X$ let $M_x$ be the set of points $y$ such that $(x, y) \in D_1$ and let $N_x$ be the set of points $y$ such that $(x, y) \in D_2$. It follows from (*) that

$$X - x = M_x + N_x, \qquad M_x\bar{N}_x = 0 = \bar{M}_x N_x.$$

$X$ being connected this implies that the sets $M_x + x$ and $N_x + x$ are connected.

Let $y \in M_x$ and $z \in N_x$. It follows that $(x, y) \in D_1$ and that $y$ does not belong to $N_x + x$. Therefore $(N_x + x) \times y \subset P(X) = D_1 + D_2$. Since $(N_x + x) \times y$ is connected and $(x, y) \in (N_x + x) \times y$ therefore $(N_x + x) \times y \subset D_1$ and in particular $(z, y) \in D_1$. Similarly we have $(M_x + x) \times z \subset D_2$ and therefore $(y, z) \in D_2$. This, however, contradicts (**) and therefore one of the sets $M_x$ or $N_x$ must be empty.

Let $(x, y) \in D_1$. It follows that $M_x \ne 0$, therefore $N_x = 0$ and $(x, y') \in D_1$ for every $y' \in X - x$. By (**) we have then also $(y', x) \in D_1$. Consequently $M_{y'} \ne 0$ and therefore $M_{y'} = X - y'$ for every $y' \in X - x$. We have therefore proved that $M_x = X - x$ for every $x \in X$, therefore $D_1 = P(X)$ and $D_2 = 0$ contradicting (*).

Comparing (3.1) with the argument used above to prove that $P(X)$ is not connected if $X$ is connected and ordered we obtain that

(3.2) If $X$ is ordered and connected⁵ then $A(X)$ and $B(X)$ are the components of $P(X)$.

4. In this section the uniqueness of the order in a connected space will be established.


(4.1) Let $X$ and $Y$ be two ordered connected spaces. Every (1-1) continuous mapping of $X$ on $Y$ either preserves the order or reverses it.

Proof. Let $\phi$ be the mapping of $X$ on $Y$. Given $(x, y) \in P(X)$ let

$$\psi(x, y) = (\phi(x), \phi(y)).$$

Clearly $\psi$ is a (1-1) continuous mapping of $P(X)$ on $P(Y)$. Because of (3.2) we have either $\psi(A(X)) = A(Y)$ or $\psi(A(X)) = B(Y)$. In the first case $\phi$ preserves the order, in the other it reverses it.


THEOREM II. Two orderings of a connected topological space X are 
either identical or inverse to each other. 


Proof follows from (4.1) by taking $Y = X$ and considering the identity transformation of $X$ on itself.


5. In this section we shall establish an inverse of (4.1) under the additional hypothesis of local connectedness.

(5.1) Let $X$ be an ordered space, $Y$ an ordered connected space, and $A$ an open and connected subset of $Y$. For every (1-1) order preserving (or reversing) mapping $\phi$ of $X$ on $Y$ the set $\phi^{-1}(A)$ is open.

Proof. By (2.2) the order-type of $Y$ is continuous. Since $A$ is connected it follows that $Y = P + A + Q$ where $y' \in P$, $y \in A$ and $y'' \in Q$ imply $y' < y < y''$. Since $A$ is open it can easily be seen that if $P \ne 0$ then there is a last element $y_1$ of $P$. Similarly if $Q \ne 0$ there is a first element $y_2$ of $Q$. It follows that either $A = Y$, or $A = A_{y_2}$, or $A = B_{y_1}$, or $A = A_{y_2}B_{y_1}$.

Let $x_1 = \phi^{-1}(y_1)$ and $x_2 = \phi^{-1}(y_2)$. Since $\phi$ is order preserving therefore $\phi^{-1}(A_{y_2}) = A_{x_2}$ and $\phi^{-1}(B_{y_1}) = B_{x_1}$. It follows that either $\phi^{-1}(A) = X$, or $\phi^{-1}(A) = A_{x_2}$, or $\phi^{-1}(A) = B_{x_1}$, or $\phi^{-1}(A) = A_{x_2}B_{x_1}$. Hence $\phi^{-1}(A)$ is open by (1.3').

(5.2) Let $X$ be an ordered space and $Y$ an ordered connected and locally connected space. Every (1-1) order preserving (or reversing) mapping of $X$ on $Y$ is continuous.

Proof. Let $\phi$ be the mapping. From (5.1) it follows that $\phi^{-1}(A)$ is open for every open connected set $A \subset Y$. Since $Y$ is locally connected this implies that $\phi^{-1}(U)$ is open for every open set $U \subset Y$. Therefore $\phi$ is continuous.


6. Adding separability to our hypotheses we shall give a new characterization of the connected subsets of the linear continuum.


(6.1) A connected separable topological space $X$ can be mapped (1-1) and continuously on a subset of the linear continuum if and only if $P(X)$ is not connected.⁶

Proof. Let $\phi$ be a (1-1) continuous mapping of $X$ on a linear set $Y$. Clearly $P(Y)$ is then a (1-1) continuous image of $P(X)$ and since $P(Y)$ is not connected, $P(X)$ is not connected.

Suppose now that $P(X)$ is not connected. By Th. I the space $X$ can be ordered and by (2.2) its order type will be continuous.

Let $A \subset X$ be an enumerable set such that $\bar{A} = X$. Let $x < y$; since the order of $X$ is continuous there is a $z$ such that $x < z < y$. Therefore $A_y B_x \ne 0$ and since $A_y B_x$ is open by (1.3') it follows that $A A_y B_x \ne 0$ which means that there is a $z' \in A$ such that $x < z' < y$. The set $A$ is therefore dense in $X$ in the sense of order.

Since the order type of $X$ is continuous and there is an enumerable subset of $X$ dense in $X$ in the sense of order, therefore there is a (1-1) order preserving correspondence $\phi$ mapping $X$ on the open, half-open or closed interval $Y$ on the straight line.⁷ Since $Y$ is locally connected, therefore by (5.2) $\phi$ is continuous. This proves (6.1).

The inverse $\phi^{-1}$ is also (1-1) and order preserving. Hence if $X$ is locally connected $\phi^{-1}$ is continuous by (5.2) and $\phi$ is a homeomorphism. Hence we have proved

THEOREM III. A connected⁵ locally connected separable topological space $X$ is homeomorphic with a subset of the linear continuum if and only if $P(X)$ is not connected.


7. Let $X_\alpha$ and $X_\beta$ be two topologizations of an abstract set $X$. The topology of $X_\alpha$ is said to be stronger than the topology of $X_\beta$ if the class of open sets in $X_\alpha$ is a proper subclass of the class of open sets in $X_\beta$, in other words if the identity mapping of $X$ on itself induces a continuous mapping of $X_\beta$ on $X_\alpha$ which is not a homeomorphism. In this way all possible topologizations of $X$ form a partially ordered set.⁸

Given an ordered abstract set $X$ we shall consider the class $[X]$ of all topologizations of $X$ which lead to an ordered topological space with respect to the given order. It follows from (1.4) that $[X]$ contains only Hausdorff topologies.


6 For $X$ compact this theorem was announced without proof by C. Pauc, C. R. Paris 203 (1936), p. 154.

7 Hausdorff, loc. cit., p. 101.

8 Garrett Birkhoff, Fundamenta Mathematicae, vol. 26 (1936), p. 156.


We may consider $X$ with the discrete topology $X_0$, i.e. the topology for which every set is open. Of course $X_0$ is the weakest topology in the class $[X]$.

We shall now define a topology $X_1$ for $X$ which will be the strongest one in the class $[X]$.

Let $x, y \in X$ and $x < y$. Consider every one of the sets $A_x$, $C_{xy} = A_y B_x$, $B_y$, as a neighborhood of every one of its points. Let $X_1$ be the topological space thus obtained. It is clear that (1.3') is satisfied and therefore $X_1$ belongs to the class $[X]$.
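The construction of $X_1$ can be illustrated on a finite chain. The sketch below is only a bookkeeping aid under stated assumptions: the chain is given as a sorted Python list, the function names are hypothetical, and on a finite chain the resulting topology is discrete, so nothing about connectedness is illustrated. It lists the basic sets $A_x$, $B_x$, $C_{xy}$ and, for a pair $x < y$, searches for basic neighborhoods $U(x)$, $U(y)$ satisfying condition (1.3).

```python
from itertools import combinations


def basic_open_sets(chain):
    """Non-empty sets A_x, B_x and C_xy (the intersection of A_y and B_x) of a sorted finite chain."""
    sets = []
    for x in chain:
        sets.append(frozenset(z for z in chain if z < x))  # A_x
        sets.append(frozenset(z for z in chain if z > x))  # B_x
    for x, y in combinations(chain, 2):  # x < y because the chain is sorted
        sets.append(frozenset(z for z in chain if x < z < y))  # C_xy
    return [s for s in sets if s]


def separating_neighborhoods(chain, x, y):
    """Find basic sets U containing x and V containing y as required by (1.3), or return None."""
    basis = basic_open_sets(chain)
    for U in basis:
        if x not in U or not all(u < y for u in U):
            continue
        for V in basis:
            if y in V and all(x < v for v in V):
                return U, V
    return None


if __name__ == "__main__":
    chain = [1, 2, 3, 5, 8]
    for x, y in combinations(chain, 2):
        U, V = separating_neighborhoods(chain, x, y)
        print(x, y, sorted(U), sorted(V))
```

The search always succeeds for $x < y$, since $A_y$ and $B_x$ themselves are admissible choices; this is the same observation that makes (1.3') imply (1.3).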

Given any topology $X_\alpha$ in $[X]$ the sets $A_x$, $C_{xy}$ and $B_y$ are open because of (1.3'). It follows that $X_1$ is not weaker than $X_\alpha$. Conversely given any topology $X_\alpha$ not stronger than $X_1$ the sets $A_x$ and $B_y$ are open in $X_\alpha$. (1.3') being satisfied $X_\alpha$ belongs to $[X]$. Hence we have proved that

(7.1) The class $[X]$ consists of $X_1$ and all topologies weaker than $X_1$.

Since the topology $X_1$ is entirely defined in terms of order, therefore

(7.2) If $X$ and $Y$ are ordered sets, then every (1-1) order preserving mapping of $X$ on $Y$ induces an order preserving homeomorphism of $X_1$ and $Y_1$.

8. In this section the structure of $X_1$ will be described in the case when $X$ is ordered continuously.

(8.1) For every ordered set $X$ the following conditions are equivalent:

(a) The order-type of $X$ is continuous,

(b) $X_1$ is connected and locally connected,

(c) $X$ can be topologized so as to become an ordered connected topological space.

Proof. Clearly (b) implies (c). From (2.2) it follows that (c) implies (a). We shall prove now that (a) implies (b).

Let $x < y$ and let $Y$ be one of the sets $X$, $A_x$, $A_y B_x$ or $B_y$. We shall prove that if (a) holds then $Y$ is a connected subset of $X_1$. Assume on the contrary that

(*) $Y = D_1 + D_2$, $D_1\bar{D}_2 = 0 = \bar{D}_1 D_2$, $D_1 \ne 0 \ne D_2$.

Let $x_1 \in D_1$, $x_2 \in D_2$, $x_1 < x_2$ and let

$$P = D_1 A_{x_2} + \sum A_z \text{ where } z \in D_1 A_{x_2}, \qquad Q = X - P.$$

Clearly $x_1 \in P$ and $x_2 \in Q$. Also if $x' \in P$ and $x'' < x'$ then $x'' \in P$. It follows that

$$X = P + Q, \qquad P \ne 0 \ne Q, \qquad PQ = 0,$$

and that $x < y$ whenever $x \in P$, $y \in Q$. The ordering of $X$ being continuous it follows that either $P$ has a last element or $Q$ a first element.

Suppose that $P$ has a last element $x_0$. It follows that $x_0 \in D_1 A_{x_2}$ and that $z \in D_2$ for every element $z \in Y$ such that $x_0 < z < x_2$, or in other words that $Y C_{x_0 x_2} \subset D_2$. A glance at the definition of $Y$ shows that $C_{x_0 x_2} \subset Y$ since $x_0, x_2 \in Y$. It follows that $C_{x_0 x_2} \subset D_2$. Since the ordering of $X$ is continuous it is clear that the set $C_{x_0 x_2}$ has points in common with every neighborhood of $x_0$ and therefore $x_0 \in \bar{D}_2$. It follows that $x_0 \in D_1\bar{D}_2$ contradicting (*).

Suppose now that $Q$ has a first element $y_0$. Since $x_2 \in D_2 Q$ therefore either $y_0 = x_2$ or $y_0 < x_2$. In both cases it follows from the definition of $P$ that $y_0 \in D_2$.

Given any $y < y_0$ we have $C_{y y_0} \subset P$ and therefore $D_1 C_{y y_0} \ne 0$. Since the ordering of $X$ is continuous every neighborhood $U$ of $y_0$ must contain the set $C_{y y_0}$ for some $y < y_0$. It follows that $D_1 U \ne 0$, therefore $y_0 \in \bar{D}_1$ and finally $y_0 \in \bar{D}_1 D_2$ contradicting (*).


(8.2) Given a set X ordered continuously, there is exactly one topologization 
of X which leads to a connected and locally connected ordered space. 


Proof. It follows from (8.1) that there is at least one such topologization. Let $X_\alpha$ and $X_\beta$ be two topologizations of $X$ both leading to connected and locally connected ordered spaces. The identity transformation of $X$ on itself induces an order preserving (1-1) transformation of $X_\alpha$ on $X_\beta$ and conversely. Since $X_\alpha$ and $X_\beta$ are both connected and locally connected, it follows from (5.2) that these transformations are continuous and therefore establish the identity of $X_\alpha$ and $X_\beta$.


UNIVERSITY OF MICHIGAN. 


A GENERALIZATION OF A METRIC SPACE WITH APPLICATIONS 
TO SPACES WHOSE ELEMENTS ARE SETS.* 


By G. BALEY PRICE. 


1. Introduction. The metric spaces were among the earliest of the 
abstract spaces studied; later the general topological spaces were developed. 
This paper shows how a topological structure can be introduced by means of a 
function which is more general than the ordinary distance function and 
reduces to it in a special case, and investigates the properties of this topology. 
Conditions under which the space is metricisable are given in section 5. The 
investigation was suggested by a study of topological structures in spaces 
whose elements are sets (see Hausdorff [7, pp. 145-150], Kuratowski [9, pp.
89-92, 152-158]) and, in particular, by structures based on symmetric differ- 
ences of sets (see MacNeille [10], Wazewski [17], Fréchet [4], Nikodym 
[13], Szpilrajn [15]). The paper is divided into two parts; the theory of 
the generalized metric space is developed in the first part, and the results are 
applied to spaces whose elements are sets in the second part. The values of 
the generalized distance function are elements in a certain partially ordered 
space; the results thus have some connection, although not close, with Kan- 
torovitch’s [8] partially ordered spaces. 


PART I. A Generalization of a Metric Space. 


2. The partially ordered space $\mathfrak{D}$. Let $\mathfrak{D}$ be a space with elements $d, e, \cdots$, a partial order $<$ defined for some pairs of elements in $\mathfrak{D}$, and an operation $+$ defined for every pair of elements. We make the following postulates concerning $\mathfrak{D}$:

(2.1) The relation $d < e$ holds for some pairs of elements $d, e$ in $\mathfrak{D}$.

(2.2) The partial ordering is transitive, that is, $(d < e)(e < f) \to (d < f)$.

(2.3) $(d + e) \in \mathfrak{D}$ and $d + e = e + d$ for every pair of elements $d, e$ in $\mathfrak{D}$.

(2.4) $(d_1 < e_1)(d_2 < e_2) \to (d_1 + d_2 < e_1 + e_2)$.

3. The metric. Let $\mathfrak{E}$ be a subset of $\mathfrak{D}$ with the following property:


* Received July 18, 1939; Revised August 15, 1940.
1 Numbers in square brackets refer to the bibliography at the end.


(3.1) HYPOTHESIS I. Given any element $e_0$ in $\mathfrak{E}$, there exist elements $e_1$, $e_2$ in $\mathfrak{E}$ such that $e_1 + e_2 \le e_0$.

Let $K$ be a space with elements $x, y, \cdots$. Let $d(x, y)$ be a function with values in $\mathfrak{D}$ which is defined for every pair of elements $x, y$ in $K$, and which has the following properties:

(3.2) $d(x, x) < e$ for every $e$ in $\mathfrak{E}$ and $x$ in $K$;

(3.3) $d(x, y) < e$ for every $e$ in $\mathfrak{E}$ implies $x = y$;

(3.4) $d(x, y) = d(y, x)$;

(3.5) $d(x, z) \le d(x, y) + d(y, z)$.
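Properties (3.2) to (3.5) can be tried out on a small set-valued example. The following sketch rests entirely on assumptions chosen for illustration and is not the construction of the paper: $K$ and $\mathfrak{D}$ are both taken to be the subsets of a three-element set, $+$ is union, $<$ is proper inclusion, $\mathfrak{E}$ is the family of non-empty subsets, and $d(x, y)$ is the symmetric difference, as suggested by the structures mentioned in the introduction. All function names are hypothetical.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def d(x, y):
    """Set-valued distance: the symmetric difference of two finite sets."""
    return frozenset(x) ^ frozenset(y)


def plus(a, b):
    """The operation + of the value space, taken here to be union."""
    return a | b


def less(a, b):
    """The relation < of the value space, taken here to be proper inclusion."""
    return a < b


def check_axioms(points, radii):
    """Assert (3.2)-(3.5) for every choice of points; radii plays the role of the subset E."""
    for x in points:
        assert all(less(d(x, x), e) for e in radii)           # (3.2)
    for x, y in product(points, repeat=2):
        assert d(x, y) == d(y, x)                             # (3.4)
        if all(less(d(x, y), e) for e in radii):
            assert x == y                                     # (3.3)
    for x, y, z in product(points, repeat=3):
        assert d(x, z) <= plus(d(x, y), d(y, z))              # (3.5), with <= as inclusion
    return True


if __name__ == "__main__":
    points = subsets({0, 1, 2})
    radii = [e for e in points if e]  # the non-empty subsets
    print(check_axioms(points, radii))
```

With these choices Hypothesis I also holds trivially, since $e_1 = e_2 = e_0$ gives $e_1 + e_2 = e_0$.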


4. Topological structure in $K$. The set of points $x$ in $K$ such that $d(x, x_0) < e_0$, where $x_0 \in K$ and $e_0 \in \mathfrak{E}$, will be called a sphere with center $x_0$ and radius $e_0$. A point $x_0$ is a point of accumulation of a set $E \subset K$ if and only if every sphere with center $x_0$ contains a point of $E$ distinct from $x_0$. A set is closed if it contains all of its points of accumulation and open if it is the complement of a closed set. The derived set $E'$ of $E$ is defined in the usual way.

(4.1) THEOREM. If $x_1$, $x_2$ are two distinct elements of $K$, there exist two disjoint spheres with centers $x_1$, $x_2$.

By (3.3) there exists a sphere $d(x, x_1) < e_0$ which does not contain $x_2$. Let $e_1$, $e_2$ be elements of $\mathfrak{E}$ such that $e_1 + e_2 \le e_0$. If the spheres $d(x, x_1) < e_1$, $d(x, x_2) < e_2$ had an element $x$ in common, we would have $d(x_1, x_2) \le d(x_1, x) + d(x, x_2) < e_1 + e_2 \le e_0$, a contradiction of the definition of $e_0$.


5. Further hypotheses on $\mathfrak{E}$ and their consequences. In this section we shall assume without further mention that $\mathfrak{D}$ and $\mathfrak{E}$ satisfy all of the assumptions of sections 2 and 3, and that $\mathfrak{E}$ satisfies certain further hypotheses as stated.

(5.1) HYPOTHESIS II. Given any two elements $e_0 \in \mathfrak{E}$ and $d \in \mathfrak{D}$, $d < e_0$, there exists an element $e \in \mathfrak{E}$ such that $d + e \le e_0$.

(5.2) THEOREM. If $\mathfrak{E}$ satisfies Hypothesis II, every sphere $d(x, x_0) < e_0$ is an open set.

Let $x_1$ be any point of $d(x, x_0) < e_0$. Then $d(x_1, x_0) = d < e_0$. By Hypothesis II there is an $e$ in $\mathfrak{E}$ such that the sphere $d(x, x_1) < e$ is contained in $d(x, x_0) < e_0$, for $d(x, x_0) \le d(x, x_1) + d(x_1, x_0) < d + e \le e_0$ and $d(x, x_0) < e_0$ by (2.2). Since every point of $d(x, x_0) < e_0$ is the center of some sphere contained in the set, it is an open set.


(5.3) HYPOTHESIS III. Given any two elements $e_1$, $e_2$ in $\mathfrak{E}$, there exists a third element $e$ in $\mathfrak{E}$ such that $e \le e_1$ and $e \le e_2$.

From Hypotheses I and III it follows that for each $e_0$ in $\mathfrak{E}$ there exists an element $e$ such that $e + e \le e_0$. For there exist elements $e_1$, $e_2$ such that $e_1 + e_2 \le e_0$ and an element $e$ such that $e \le e_1$ and $e \le e_2$. Then $e + e \le e_0$ by (2.2) and (2.4). The following theorem is a consequence of this fact.

(5.4) THEOREM. If $\mathfrak{E}$ satisfies Hypothesis III, there is a sphere $d(x, x_0) < e$ contained in the product of any two spheres $d(x, x_0) < e_1$, $d(x, x_0) < e_2$.


(5.5) THEOREM. If $\mathfrak{E}$ satisfies Hypotheses I, II, III, the space $K$ is regular,
that is, if $G$ is an open set, and if $x_0 \in G$, there exists an open set which contains
$x_0$ and whose closure is contained in $G$.

Since $G$ is open, there is a sphere $d(x, x_0) < e_0$ contained in $G$. By
Hypotheses I, III there is an element $e$ such that $e + e \le e_0$. We shall show
that the sphere $d(x, x_0) < e$, which is open by Theorem 5.2, is a set whose
closure is contained in $d(x, x_0) < e_0$ and hence in $G$. Suppose a point $x^*$ not
contained in $d(x, x_0) < e_0$ were a point of accumulation of the set $d(x, x_0) < e$.
Then the sphere $d(x, x^*) < e$ contains an element $x_1$ in $d(x, x_0) < e$, and
$d(x^*, x_0) \le d(x^*, x_1) + d(x_1, x_0) < e + e \le e_0$. But $d(x^*, x_0) < e_0$ contradicts
the assumption that $x^*$ is not contained in $d(x, x_0) < e_0$. The proof is
complete.


(5.6) THEOREM. If $\mathfrak{E}$ satisfies Hypotheses I, II, III, and if a sphere with
center $x_0$ be defined as a neighborhood of $x_0$, then $K$ is a regular Hausdorff
space.

The first postulate of Hausdorff (see Kuratowski [9, p. 28]) is satisfied
since every point $x$ in $K$ has neighborhoods. The next postulate is satisfied
as a result of Theorem 5.4. If $d(x, x_0) < e_0$ is a neighborhood of $x_0$ and $x_1$
is a point in it, there is a neighborhood of $x_1$ contained in the given neighborhood
of $x_0$; this result follows from Hypothesis II as shown in the proof of
Theorem 5.2. Again, if $x_1$ and $x_2$ are two distinct elements of $K$, there exist,
by Theorem 4.1, neighborhoods $d(x, x_1) < e_1$ and $d(x, x_2) < e_2$ without
common points. Finally, the space is regular by Theorem 5.5.

A set $E$ in $K$ is said to be totally bounded if it can be covered by a finite
number of spheres of any given radius $e$ in $\mathfrak{E}$. A set $E$ is said to be compact
in $K$ if every infinite set in $E$ has a point of accumulation in $K$.




A GENERALIZATION OF A METRIC SPACE. 49 


(5.7) THEOREM. If $\mathfrak{E}$ satisfies Hypothesis III, a compact set $E$ is totally
bounded.

Let $e$ in $\mathfrak{E}$ be given. Choose any two elements $x_1$, $x_2$ in $E$ such that $x_2$
is not contained in $d(x, x_1) < e$. Choose a third point $x_3$ which is not contained
in either of the spheres $d(x, x_1) < e$, $d(x, x_2) < e$. Continue in this
manner to choose points $x_1, x_2, \cdots$; if this set is infinite, it has a point of
accumulation $x_0$. By Hypothesis I there exist elements $e_1$, $e'_2$ in $\mathfrak{E}$ such that
$e_1 + e'_2 \le e$. The sphere $d(x, x_0) < e_1$ contains a point $x_m$, distinct from
$x_0$, of the set selected. By Theorem 4.1 there is a sphere $d(x, x_0) < e''_2$ which
does not contain $x_m$. By Hypothesis III there is an element $e_2$ in $\mathfrak{E}$ such that
$e_2 \le e'_2$, $e_2 \le e''_2$. The sphere $d(x, x_0) < e_2$ contains an element $x_n$ distinct
from $x_0$; since $d(x, x_0) < e_2$ is contained in $d(x, x_0) < e''_2$, which does not
contain $x_m$, the elements $x_m$ and $x_n$ are distinct. But $d(x_m, x_n) \le d(x_m, x_0) +
d(x_0, x_n) < e_1 + e_2 \le e_1 + e'_2 \le e$. This contradiction of the definition of
the set establishes the theorem.


(5.8) HYPOTHESIS IV. Given any element $d$ in $\mathfrak{D}$, there is an element $e$
in $\mathfrak{E}$ such that


A set $E$ in $K$ is said to be bounded if and only if there is an element $e$ in
$\mathfrak{E}$ such that $d(x, y) < e$ for every pair of elements $x$, $y$ in $E$. If $\mathfrak{E}$ satisfies
Hypothesis IV, the sum of any two bounded sets is a bounded set. Let
$d(x, x_0) < e_1$ for $x$, $x_0$ in $E_1$ and $d(y, y_0) < e_2$ for $y$, $y_0$ in $E_2$. Then $d(x, y) \le
d(x, x_0) + d(x_0, y_0) + d(y_0, y)$, and the stated result follows from (2.3),
(2.4), and Hypothesis IV.


(5.9) HYPOTHESIS V. Given any two elements $e_1$, $e_2$ of $\mathfrak{E}$, there is an
element $e$ in $\mathfrak{E}$ such that $e_1 \le e$ and $e_2 \le e$.

If $\mathfrak{E}$ satisfies Hypotheses IV and V, a totally bounded set $E$ is bounded.
For since $E$ is totally bounded, it can be covered by a finite number of spheres
$d(x, x_1) < e_0, \cdots, d(x, x_n) < e_0$. Let $y_i$ denote any point in $d(x, x_i) < e_0$.
From (3.5), (2.3), and Hypothesis IV it follows that $d(y_i, y_j) < e_0 +
d(x_i, x_j) + e_0 \le e_{ij}$, where $e_{ij} \in \mathfrak{E}$. By (2.2) and a repeated application of
Hypothesis V we see that there exists an element $e$ in $\mathfrak{E}$ such that $e_{ij} \le e$ for
all $i$ and $j$. It follows that $E$ is bounded.
(5.10) HYPOTHESIS VI. There exists a denumerable set of elements $e_1$,
$e_2$, $e_3, \cdots$ in $\mathfrak{E}$, to be denoted by $\{e_k\}$, such that if $e$ is any element in $\mathfrak{E}$, there
is an element $e_k$ for which $e_k \le e$.

(5.11) THEOREM. If $\mathfrak{E}$ satisfies Hypotheses I, III, VI, the space $K$ is
metricisable.



First we shall show that $\{e_k\}$ contains a subset $\{e_{k_i}\}$ with the following
properties:

(5.12) $e_{k_1} \ge e_{k_2} \ge e_{k_3} \ge \cdots$;
(5.13) $e_{k_{i+1}} + e_{k_{i+1}} \le e_{k_i}$;
(5.14) given any $e_k$ in $\{e_k\}$, there exists an $e_{k_i}$ in $\{e_{k_i}\}$ such that $e_{k_i} \le e_k$.

It follows from Hypothesis VI and (5.14) that the points of accumulation
of sets and the limits of sequences are the same in terms of $\mathfrak{E}$, $\{e_k\}$, and $\{e_{k_i}\}$.

The subset $\{e_{k_i}\}$ of $\{e_k\}$ can be chosen as follows. Set $e_{k_1} = e_1$. Let $e^*_1$
be an element in $\mathfrak{E}$ such that $e^*_1 + e^*_1 \le e_{k_1}$; such an element exists by
Hypotheses I and III. Choose $e_{k_2}$ as an element of $\{e_k\}$ such that $e_{k_2} \le e^*_1$,
$e_1$, $e_2$. This choice is possible, for by Hypothesis III and (2.2) there is an
element $e'_1$ of $\mathfrak{E}$ such that $e'_1 \le e^*_1$, $e_1$, $e_2$; by Hypothesis VI there is an
element $e_{k_2}$ in $\{e_k\}$ such that $e_{k_2} \le e'_1$; finally, by (2.2) and (2.4), $e_{k_2} \le e^*_1$,
$e_1$, $e_2$ and $e_{k_2} + e_{k_2} \le e^*_1 + e^*_1 \le e_{k_1}$. This process can be continued, for suppose
that elements $e_{k_1}, \cdots, e_{k_r}$ satisfying (5.12), (5.13) have been chosen.
Choose elements $e^*_r$, $e'_r$ in $\mathfrak{E}$ such that $e^*_r + e^*_r \le e_{k_r}$ and $e'_r \le e^*_r$,
$e_1$, $e_2, \cdots, e_{r+1}$. Finally, choose $e_{k_{r+1}}$ as an element of $\{e_k\}$ such that $e_{k_{r+1}} \le e'_r$.
It is clear that the set thus chosen satisfies (5.12), (5.13), (5.14).
In terms of the sequence (5.12) we shall define a distance function
$\rho(x, y)$ with the following properties:

(5.15) for each pair of elements $x$, $y$ in $K$, $\rho(x, y)$ is a non-negative real
number;

(5.16) $\rho(x, y) = 0$ if and only if $x = y$;

(5.17) $\rho(x, y) = \rho(y, x)$;

(5.18) if $\rho(x, y) < \epsilon$ and $\rho(y, z) < \epsilon$, then $\rho(x, z) < 2\epsilon$.


Such a function can be defined as follows: set $\rho(x, y) = 1/2^i$ if and only if
$d(x, y) < e_{k_i}$ is true and $d(x, y) < e_{k_{i+1}}$ is false; set $\rho(x, y) = 0$ if and only
if $d(x, y) < e_{k_i}$ for all $i$. Then by the properties of $d(x, y)$, it is obvious
that $\rho(x, y)$ satisfies (5.15), (5.16), (5.17). Also it satisfies (5.18), for
suppose $\rho(x, y) = 1/2^i$ and $\rho(y, z) = 1/2^j \le 1/2^i$. Then $d(x, z) \le d(x, y) +
d(y, z) < e_{k_i} + e_{k_j} \le e_{k_i} + e_{k_i} \le e_{k_{i-1}}$, and $\rho(x, z) \le 1/2^{i-1}$. It follows easily
that (5.18) is satisfied. It is to be observed that the topology based on the
distance function $\rho(x, y)$ is equivalent to that based on $d(x, y)$ and the set $\{e_k\}$.

Finally, Frink [5] has shown that it is possible to metricise $K$ by a
distance function which is topologically equivalent to $\rho(x, y)$. The proof is
complete.
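
The construction of $\rho$ can be tried out in the ordinary metric case of section 6. The following is a minimal sketch under stated assumptions, not part of the original argument: the radii are taken to be the hypothetical choice $e_{k_i} = 2^{-i}$ on the rational numbers (they satisfy (5.13) with equality), and the value $\rho = 1$ is assigned when $d(x, y) < e_{k_1}$ already fails, a case the text does not spell out. Condition (5.18) is tested in the equivalent form $\rho(x, z) \le 2 \max[\rho(x, y), \rho(y, z)]$.

```python
from fractions import Fraction
from itertools import product


def radius(i):
    """Hypothetical radii e_{k_i} = 2**-i; note radius(i+1) + radius(i+1) <= radius(i)."""
    return Fraction(1, 2 ** i)


def d(x, y):
    """Ordinary distance on the rationals, standing in for d(x, y)."""
    return abs(Fraction(x) - Fraction(y))


def rho(x, y):
    """1/2**i when d < e_{k_i} holds and d < e_{k_{i+1}} fails; 0 when d < e_{k_i} for all i."""
    dist = d(x, y)
    if dist == 0:
        return Fraction(0)
    i = 0  # assumption: rho = 1 when even d < e_{k_1} fails
    while dist < radius(i + 1):
        i += 1
    return Fraction(1, 2 ** i)


def weak_triangle_holds(points):
    """Check (5.18) as rho(x, z) <= 2 * max(rho(x, y), rho(y, z)) on all triples."""
    return all(
        rho(x, z) <= 2 * max(rho(x, y), rho(y, z))
        for x, y, z in product(points, repeat=3)
    )


sample = [Fraction(n, 64) for n in range(0, 97, 3)]
print(rho(0, Fraction(3, 16)))      # 3/16 < 1/4 holds, 3/16 < 1/8 fails
print(weak_triangle_holds(sample))
```

Exact rational arithmetic is used so that the strict comparisons $d < e_{k_i}$ are not blurred by rounding.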




6. First examples. Let $\mathfrak{D}$ be the class of real numbers $x \ge 0$, and let
$\mathfrak{E}$ be the real numbers $> 0$. Furthermore, let $<$ and $+$ have their usual
meaning in the system of real numbers. Then $\mathfrak{D}$ satisfies all the assumptions
in section 2, and $\mathfrak{E}$ satisfies Hypotheses I-VI. A space $K$ with a metric
$d(x, y)$ having the properties stated in section 3 is an ordinary metric space.

Consider next the partially ordered spaces of Kantorovitch [8]. Let $X$
be a class of elements $x$ which form an additive abelian group. Furthermore,
let there be a relation $>$ defined so that for some of the elements $x$ in $X$ the
relation $x > 0$ holds. Kantorovitch assumes that this system satisfies the
following postulates:

(6.1) The relation $x > 0$ excludes $x = 0$.

(6.2) If $x_1 > 0$ and $x_2 > 0$, then $x_1 + x_2 > 0$.

(6.3) To each element $x \in X$ there corresponds at least one element $x_1 \in X$
such that $x_1 \ge 0$ and $x_1 - x \ge 0$.

(6.4) For every set $E$ bounded above there exists a least upper bound sup $E$.

In some cases it is assumed in addition that $X$ is a vector space over the
real number system. Then the following postulate is applicable:

(6.5) If $x > 0$ and if $\lambda > 0$ is a real number, then $\lambda x > 0$.

If the first four postulates are satisfied in $X$, it is called a partially ordered
topological group. If in addition the fifth postulate is satisfied in $X$, it is
called a linear partially ordered space.

If $x_1 - x_2 > 0$, we say $x_1 > x_2$. In a partially ordered space in which
the first four postulates are satisfied it is possible to define an absolute value
$|x|$ of $x$; the absolute value of $x$ is an element in $X$ and has the formal
properties of the absolute value of a real number.

Let $X$ be a linear partially ordered space, $\mathfrak{D}$ the subset of elements $x \ge 0$,
and $\mathfrak{E}$ the subset of elements $x > 0$. Then $\mathfrak{D}$ and $\mathfrak{E}$ satisfy all the assumptions
of sections 2 and 3. If $K$ be taken as $X$, and if $d(x, y)$ be defined as $|x - y|$,
it follows from the properties of the absolute value in $X$ that $d(x, y)$ has
properties (3.2), (3.3), (3.4), (3.5). It should be observed, however, that
$\mathfrak{E}$ may not satisfy Hypothesis III in this case; in fact, a simple linear partially
ordered space can be formed from the Euclidean plane in which $\mathfrak{E}$ satisfies
Hypotheses I, II, IV, V, VI but fails to satisfy Hypothesis III. The sum
$x_1 + x_2$ of two elements $x_1$: $(\xi_1, \eta_1)$ and $x_2$: $(\xi_2, \eta_2)$ is the element $(\xi_1 + \xi_2,
\eta_1 + \eta_2)$; the element $x$: $(\xi, \eta)$ follows 0: $(0, 0)$, that is $x > 0$, if and only
if $\xi \ge 0$ and $\eta \ge 0$ with the inequality holding in at least one case; the
absolute value $|x|$ of $x$: $(\xi, \eta)$ is $(|\xi|, |\eta|)$. The only element which precedes
two elements $(\xi, 0)$ and $(0, \eta)$ in $\mathfrak{D}$ is $(0, 0)$, which is not in $\mathfrak{E}$; hence
Hypothesis III is not satisfied. It should be observed also that there are
many other subsets of $\mathfrak{D}$ which satisfy Hypothesis I as well as certain of the
others and may therefore be chosen as the set $\mathfrak{E}$. Finally, it may be observed
that (6.5) is a much stronger hypothesis than is required for Hypotheses IT] 
and V. 
Part II. Spaces Whose Elements Are Sets. 

7. Symmetric differences of sets. Let $K$ denote a class of elements with
subsets $X$, $Y$, $\cdots$. The symmetric difference $s(X_1, X_2)$ of two sets $X_1$, $X_2$ is
defined by

(7.1) $s(X_1, X_2) = (X_1 + X_2) - X_1 X_2$.

The symmetric differences of sets have the following properties:

(7.2) (a) $s(X, X) = 0$;
(b) $s(X_1, X_2) = s(X_2, X_1)$;
(c) $s(X_1, X_2) \supset 0$;
(d) $s(X_1, X_3) \subset s(X_1, X_2) + s(X_2, X_3)$.

(7.3) $s(X_1, X_1 + X_2) = s(X_1 X_2, X_2)$.


(7.6) For every m2 1 
(a) s[X, +) + S 8(X, Xm) 
+ 3(X, 
+ 8(X,Xma) 
(7.7) For every n=1 
+ 8(Xne, Xn) 5 
+ s(Xni, Xns2) +: 
(7.8) D X2) | = [8(X1, Xs) 8 (Mi, X2) 
Xs) ] = [8(Xi, X2)8(X2, Xs) — 0]. 


(7.9) (a) $s(X_1, X_2) = s(X_1, X_1 X_2) + s(X_1 X_2, X_2)$;
(b) $s(X_1, X_2) = s(X_1, X_1 X_2) + s(X_1, X_1 + X_2)$.

The first relation in (7.9) follows from (7.8); the second follows from
the first and (7.3). The verification of the remaining relations can be
supplied from (7.1).
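
As a small sanity check of the relations that are legible in this section, the identities (7.2)(a), (b), (d), (7.3), and (7.9) can be tested exhaustively on a small finite universe. This is only an illustrative sketch: the helper names `s`, `subsets`, and `check_identities` are hypothetical, and the paper's sum and product of sets are read as union and intersection.

```python
from itertools import product


def s(a, b):
    """Symmetric difference as in (7.1): (a + b) - ab, with + as union and product as intersection."""
    return (a | b) - (a & b)


def subsets(universe):
    """All subsets of a small finite universe, as frozensets."""
    items = sorted(universe)
    return [
        frozenset(x for x, keep in zip(items, mask) if keep)
        for mask in product([False, True], repeat=len(items))
    ]


def check_identities(universe):
    """Return True if every tested identity holds for all subsets of the universe."""
    family = subsets(universe)
    for x1, x2 in product(family, repeat=2):
        assert s(x1, x1) == frozenset()                        # (7.2)(a)
        assert s(x1, x2) == s(x2, x1)                          # (7.2)(b)
        assert s(x1, x1 | x2) == s(x1 & x2, x2)                # (7.3)
        assert s(x1, x2) == s(x1, x1 & x2) | s(x1 & x2, x2)    # (7.9)(a)
        assert s(x1, x2) == s(x1, x1 & x2) | s(x1, x1 | x2)    # (7.9)(b)
        for x3 in family:
            assert s(x1, x3) <= s(x1, x2) | s(x2, x3)          # (7.2)(d)
    return True


print(check_identities({0, 1, 2}))
```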


8. Introduction of a structure in a system of sets $\mathfrak{M}$. Let $\mathfrak{M}$ denote
a system of sets in $K$ which is a ring, a field, a $\sigma$-system, and a $\delta$-system (see
Hahn [6, pp. 10-20]). The symmetric difference of two sets in $\mathfrak{M}$ is a set
in $\mathfrak{M}$. Let the sets $X$ in $\mathfrak{M}$ be the elements of the space $\mathfrak{D}$ (see section 2).
We introduce a partial order in $\mathfrak{D}$ in terms of point-set inclusion; more precisely,
if $X$, $Y \in \mathfrak{D}$, $X < Y$ is defined to be equivalent to $X \subset Y$. Finally,
the operation $+$ of section 2 is here defined to be point-set addition of sets
in $\mathfrak{D}$. It can be verified that (2.1)-(2.4) hold for the partial order $<$ and
the operation $+$ as thus defined.

Next, let $\mathfrak{E}$ with elements $E$ denote a subset of $\mathfrak{M}$ which does not contain
the empty set, but which is such that the product of the sets in $\mathfrak{E}$ is the null
set. We observe that Hypotheses I and II (see (3.1), (5.1)) are automatically
satisfied in $\mathfrak{E}$ since $E + E = E$ and $D + E = E$ if $D < E$.

Finally, set $d(X, Y) = s(X, Y)$, the symmetric difference of $X$ and $Y$.
Then $d(X, Y)$ is defined for every pair of elements $X$, $Y$ in $\mathfrak{M}$, has its values
in $\mathfrak{D}$, and has the properties (3.2), (3.3), (3.4), (3.5) by (7.2). The
results in sections 4 and 5 can therefore be applied in $\mathfrak{M}$.


(8.1) THEOREM. $\mathfrak{M}$ is a complete space.

Let $X_1, X_2, \cdots$ be a Cauchy sequence; then for every $E$ in $\mathfrak{E}$ there
exists an $n(E)$ such that $s(X_m, X_n) \subset E$, that is, $d(X_m, X_n) < E$, for
$m > n \ge n(E)$. Then from (7.7) the sequence has the limit $(X_1 X_2 \cdots) +
(X_2 X_3 \cdots) + \cdots$ or $(X_1 + X_2 + \cdots)(X_2 + X_3 + \cdots) \cdots$, these two
elements being equal. Since $\mathfrak{M}$ is a $\sigma$-system and a $\delta$-system, the limit of
the sequence is an element in $\mathfrak{M}$.

If $\mathfrak{E}$ is a sequence of sets $E_1 \supset E_2 \supset E_3 \supset \cdots$ whose product is the null
set, Hypotheses III and VI are satisfied. Then by Theorem 5.11 there is a
metric space $M$ which is equivalent to $\mathfrak{M}$. A somewhat stronger result can be
established directly. If $X_1$, $X_2$ are two elements in $\mathfrak{M}$ such that $s(X_1, X_2)$ is
not contained in $E_1$, we set $D(X_1, X_2) = 1$, where $D(X_1, X_2)$ is the distance
between the elements $X_1$, $X_2$ of $M$. In general, if $s(X_1, X_2)$ is contained in
$E_i$ but not in $E_{i+1}$, we set $D(X_1, X_2) = 1/2^i$; if $s(X_1, X_2) \subset E_i$ for all $i$,
we set $D(X_1, X_2) = 0$. Two elements $X_1$, $X_2$ are defined to be equal in $M$
if and only if $D(X_1, X_2) = 0$. Then the distance between two elements of $M$
is positive or zero and symmetric; equality in $M$ is equivalent to equality in
$\mathfrak{M}$; and the triangle inequality holds in the stronger form $D(X_1, X_3) \le
\max [D(X_1, X_2), D(X_2, X_3)]$. In a sphere about any point in either space
there is a sphere of the other space and with the same center; hence, points
of accumulation are the same in $\mathfrak{M}$ and $M$.
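
The distance $D$ and its strong triangle inequality can be illustrated on a toy model. The sketch below rests on assumptions that are not in the original: $K$ is taken to be the non-negative integers, the nested sequence is the hypothetical choice $E_i = \{n : n \ge i\}$ (whose product is the null set), and only finite sets are compared.

```python
from fractions import Fraction
from itertools import combinations, product


def contained_in_E(diff, i):
    """True when diff lies in the hypothetical set E_i = {n : n >= i}."""
    return all(n >= i for n in diff)


def D(x1, x2):
    """1 if s(X1, X2) is not in E_1; 1/2**i if it is in E_i but not E_{i+1}; 0 if in every E_i."""
    diff = x1 ^ x2  # symmetric difference s(X1, X2)
    if not diff:
        return Fraction(0)
    i = 0  # i = 0 encodes "not contained in E_1"
    while contained_in_E(diff, i + 1):
        i += 1
    return Fraction(1, 2 ** i)


def strong_triangle_holds(family):
    """Check D(X1, X3) <= max(D(X1, X2), D(X2, X3)) on all triples."""
    return all(
        D(a, c) <= max(D(a, b), D(b, c))
        for a, b, c in product(family, repeat=3)
    )


family = [frozenset(c) for r in range(5) for c in combinations(range(4), r)]
print(D(frozenset({0, 3}), frozenset({0, 5})))  # difference {3, 5} lies in E_3, not in E_4
print(strong_triangle_holds(family))
```

In this model $D(X_1, X_2) = 2^{-m}$, where $m$ is the least element of $s(X_1, X_2)$, which makes the ultrametric behaviour easy to see.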


9. A metric space. In this section let $\mathfrak{M}$ denote the system of all subsets
of the unit interval $0 \le x \le 1$. From (7.2) and the properties of exterior
measure it follows that $\mathfrak{M}$ becomes a metric space $M$ if the distance
$D(X_1, X_2)$ between two elements $X_1$, $X_2$ in $\mathfrak{M}$ is defined to be $m_e[s(X_1, X_2)]$,
where $m_e[A]$ denotes the exterior measure of $A$. Two sets are equal in $M$
if their symmetric difference is a set of measure zero.

A sequence $X_1, X_2, \cdots$ in a metric space $M$ such that $\sum_{i=1}^{\infty} D(X_i, X_{i+1})$
converges has been called an absolutely convergent sequence by MacNeille
[10, p. 192]. Every absolutely convergent sequence is a Cauchy sequence;
conversely, every Cauchy sequence contains an absolutely convergent subsequence.
Thus there is no loss in generality in assuming that all Cauchy
sequences are absolutely convergent. From the relations in (7.7), the properties
of exterior measure, and the fact that $\mathfrak{M}$ is a $\sigma$-system and a $\delta$-system
it follows that $M$ is a complete metric space. The limit of the absolutely
convergent sequence is $(X_1 X_2 \cdots) + (X_2 X_3 \cdots) + \cdots$ or
$(X_1 + X_2 + \cdots)(X_2 + X_3 + \cdots) \cdots$, these two sets being equal in $M$.
This proof seems to be simpler than those known for a less general result (see
Nikodym [12, pp. 139-140], Wazewski [17], and Fréchet [4]). The relations
(7.7) enable us to generalize easily other results of Wazewski [17].

Let $\mathfrak{M}_L \subset \mathfrak{M}$ denote the system of Lebesgue measurable sets on $0 \le x \le 1$,
and $M_L$ the corresponding subset of $M$. The metric space $M_L$ has been the
subject of numerous investigations (see Fréchet [4], Nikodym [12], Szpilrajn
[15, 16], Wazewski [17]). Since $\mathfrak{M}_L$ is also a $\sigma$-system and a $\delta$-system, it
follows from (7.7) that $M_L$ is a closed subset of $M$. It is easily shown that
$M_L$ is separable (see Wazewski [17]), but it is known that there exist completely
additive measure functions for which the corresponding metric space
is not separable (see Nikodym [13]).


In the space $M_L$ every Cauchy sequence $X_1, X_2, \cdots$ converges to a limit
$X$ and has an absolutely convergent subsequence $X_{n_1}, X_{n_2}, \cdots$ such that
$D(\overline{\lim}_{k \to \infty} X_{n_k}, \underline{\lim}_{k \to \infty} X_{n_k}) = 0$, $D(\overline{\lim}_{k \to \infty} X_{n_k}, \lim_{n \to \infty} X_n) = 0$. Furthermore, if $X_1, X_2, \cdots$
is a sequence such that $D(\overline{\lim} X_n, \underline{\lim} X_n) = 0$, the sequence is a Cauchy
sequence. Since the sequence $\{X_n + [\overline{\lim} X_n - \underline{\lim} X_n]\}$ is metrically the
same as the given sequence, there is no loss of generality in assuming that the
sets $\overline{\lim} X_n$ and $\underline{\lim} X_n$ are the same. From (7.7) we have

$s(X_n, \overline{\lim} X_n) \subset s(X_n, X_{n+1}) + s(X_{n+1}, X_{n+2}) + \cdots$,
$s(X_n, \underline{\lim} X_n) \subset s(X_n, X_{n+1}) + s(X_{n+1}, X_{n+2}) + \cdots$.

Denote the set on the right by $Y_n$. Then $Y_1 \supset Y_2 \supset \cdots$, and the product
of these sets is empty. It follows from the definition of distance that
$D(X_n, \overline{\lim} X_n) \le m[Y_n]$, $D(X_n, \underline{\lim} X_n) \le m[Y_n]$. Since $\lim m[Y_n] = 0$,
it follows that $\lim X_n = \overline{\lim} X_n = \underline{\lim} X_n$, and hence that $X_1, X_2, \cdots$ is a
Cauchy sequence in $M_L$.

Finally, $M_L$ is convex in the sense of Menger and quasi convex in the
sense of Blanc (see Menger [11], Blumenthal [2, p. 40], and Blanc [1]).
Let $X_1$, $X_2$ be two distinct elements in $M_L$; we shall show that there is an
element in $M_L$ which is between $X_1$ and $X_2$. If $D(X_1, X_1 + X_2) \ne 0$,
$D(X_1 + X_2, X_2) \ne 0$, it follows from the definition of betweenness, from the
properties of Lebesgue measure, and from the relations in (7.8) that $X_1 + X_2$
is in $M_L$ and that $D(X_1, X_1 + X_2) + D(X_1 + X_2, X_2) = D(X_1, X_2)$, that
is, that $X_1 + X_2$ is between $X_1$ and $X_2$. Suppose next that $D(X_1, X_1 + X_2) =
0$. Then $X_2$ is contained in $X_1$ except possibly for a set of measure zero, and
$X_1 - X_2$ is a set of positive measure. In this case there is a measurable set
$X_3$ which contains $X_2$ and is contained in $X_1$ and differs from both by sets of
positive measure. As before, we can show that $X_3$ is between $X_1$ and $X_2$. The
remaining case, in which $D(X_1 + X_2, X_2) = 0$, is similar. It follows from
general theorems that any two elements of $M_L$ can be joined by a segment (see
Blumenthal [2, p. 41]); the quasi convexity of $M_L$ is a corollary of this fact.

Let $\mathfrak{M}$ be any $\sigma$-system and $\delta$-system contained in a space in which there
is an exterior measure; then, as above, $\mathfrak{M}$ can be metricised in such a way that
it becomes a complete metric space.


THE UNIVERSITY OF KANSAS,
LAWRENCE, KANSAS.


REFERENCES

1. E. Blanc, "Les espaces métriques quasi convexes," Annales Scientifiques de l'École Normale Supérieure, 3d series, vol. 55 (1938), pp. 1-82.
2. L. M. Blumenthal, "Distance Geometries: A Study of the Development of Abstract Metrics," The University of Missouri Studies, Columbia, Missouri, 1938.
3. M. Fréchet, Les Espaces Abstraits, Gauthier-Villars, Paris, 1928.
4. M. Fréchet, "Sur la distance de deux ensembles," Comptes Rendus, vol. 176 (1923), pp. 1123-1124.
5. A. H. Frink, "Distance functions and the metrization problem," Bulletin of the American Mathematical Society, vol. 43 (1937), pp. 133-142.
6. H. Hahn, Reelle Funktionen, Akademische Verlagsgesellschaft, Leipzig, 1932.
7. F. Hausdorff, Mengenlehre, Walter de Gruyter, 2d edition, Berlin and Leipzig, 1927.
8. L. V. Kantorovitch, "Lineare halbgeordnete Räume," Recueil Mathématique, new series vol. 2 (1937), pp. 121-165.
9. C. Kuratowski, Topologie I, Monografje Matematyczne, vol. III, Warszawa, 1933.
10. H. M. MacNeille, "Extensions of measure," Proceedings of the National Academy of Sciences, vol. 24 (1938), pp. 188-193.
11. K. Menger, "Untersuchungen über allgemeine Metrik," Mathematische Annalen, vol. 100 (1928), pp. 75-163.
12. O. Nikodym, "Sur une généralisation des intégrales de M. J. Radon," Fundamenta Mathematicae, vol. 15 (1930), pp. 131-179.
13. O. Nikodym, "Sur l'existence d'une mesure parfaitement additive et non séparable," Académie Royale de Belgique. Classe des Sciences. Mémoires, vol. 17 (1938), fascicule 8.
14. W. Sierpinski, Introduction to General Topology, University of Toronto Press, Toronto, 1934.
15. E. Szpilrajn, "The characteristic function of a sequence of sets and some of its applications," Fundamenta Mathematicae, vol. 31 (1938), pp. 207-223.
16. E. Szpilrajn, "On the space of measurable sets," Annales de la Société Polonaise de Mathématique, vol. 18 (1938), pp. 120-121.
17. T. Wazewski, "Sur les ensembles mesurables," Comptes Rendus, vol. 176 (1923), pp. 69-70.


ON REPRESENTATIONS OF CERTAIN FINITE GROUPS.* 


By EUGENE P. WIGNER.


1. The purpose of this paper is the derivation of a classification of the 
representations of finite groups with special reference to groups which satisfy 
the following two conditions: 


a. Every element is equivalent to its reciprocal, i.e., all classes are 
ambivalent. 

b. The Kronecker (or “ direct”) product of any two irreducible repre- 
sentations of the group contains no representation more than once. 


Groups of this character will be called S. R. groups (simply reducible). 
The symmetric permutation groups of the third and fourth degree, the quaternion
group, the three dimensional rotation group, the two dimensional unimodular
unitary group are S. R. groups. The significance of condition a is
that every representation is equivalent to the conjugate imaginary representa- 
tion. One sees this most easily by assuming the representation to be unitary. 
Then, the traces of reciprocal elements are conjugate complex. They are, on 
the other hand, equal, since they belong to the same class. Thus all traces 
are real and conjugate complex representations are equivalent. 

The groups of most eigen-value problems occurring in quantum¹ theory
are S. R. This is important for the following reason.

Let us assume that we have two eigen-value problems $H_1\psi_1 = \lambda_1\psi_1$ and
$H_2\psi_2 = \lambda_2\psi_2$ which allow the same group; $\psi_1$ and $\psi_2$ shall be defined in different
spaces. One often considers then¹ the "united system" the wave functions
$\Psi$ of which are defined in the product space of the spaces of $\psi_1$ and $\psi_2$. The
"unperturbed" eigen-value equation is $(H_1 + H_2)\Psi = \lambda\Psi$. The multiplicity
of the eigen-value $\lambda = \lambda_1 + \lambda_2$ is the product of the multiplicities of $\lambda_1$
and $\lambda_2$. The eigen-value $\lambda$ splits up if one introduces a small perturbation
term into the last equation. If this perturbation allows the same group as
the original two problems and if this group satisfies the above condition b,
the characteristic functions of the eigen-values into which $\lambda$ splits can be
determined in "first approximation" by the invariance of the eigen-value
problem under the group. The properties of S. R. groups to be derived here


* Received May 1, 1940. 
¹ For a more complete discussion cf. e.g. E. Wigner: Gruppentheorie, etc., Braunschweig,
1931.


58 EUGENE P. WIGNER. 


give a basis for a suitable normalization of (and numerous relations between) 
these eigen-functions which will be dealt with elsewhere. 

We shall denote the different irreducible representations of a group by
letters $j$, $k$, $l$, etc., the identical representation (in which the matrix (1)
corresponds to every element) by 0. The elements of the group will be $P$,
$Q$, $R$, $S$, $T$, etc. The rows and columns of the representations will be designated
by small Greek letters $\kappa$, $\lambda$, $\mu$, $\nu$, etc. The $\kappa\lambda$ element of the matrix which
corresponds in the $j$-th representation to the element $R$ will be denoted by
$[j; R]_{\kappa\lambda}$ so that one has

(1) $\sum_{\lambda} [j; R]_{\kappa\lambda} [j; S]_{\lambda\mu} = [j; RS]_{\kappa\mu}$.

The character will be abbreviated to

(1a) $\sum_{\kappa} [j; R]_{\kappa\kappa} = [j; R]$.

The summation over the indices $\kappa$, $\lambda$ etc. referring to the rows or columns of
the representations will always run over all values. The unit element of the
group will be $E$, the degree of the representation $j$ is

(1b) $[j; E] = [j]$.

A star will denote the conjugate complex. The Kronecker product of the
representations $j$ and $k$ coordinates to the group element $R$ the matrix
the rows and columns of which are denoted by double indices $\kappa\mu$ and $\lambda\nu$
respectively. We set

(2) $M(R)_{\kappa\mu;\lambda\nu} = [j; R]_{\kappa\lambda} [k; R]_{\mu\nu}$.

The character corresponding to the element $R$ is

(2a) $\sum_{\kappa\mu} M(R)_{\kappa\mu;\kappa\mu} = [j; R][k; R]$.
kn 


The significance of condition b for the groups under consideration becomes 

evident if one reduces the Kronecker product of two representations, i. e. 

brings it into the form in which it appears as the sum of irreducible repre- 

sentations. The matrix by which this transformation can be carried out is— 
apart from some phase factors—uniquely determined. 

It may be useful to write down the well known orthogonality and com- 


pleteness relations for irreducible representations. These are 


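(3) $\sum_R [j; R]^*_{\kappa\lambda} [k; R]_{\mu\nu} = \frac{h}{[j]}\, \delta_{jk}\, \delta_{\kappa\mu}\, \delta_{\lambda\nu}$,

(3a) $\sum_R [j; R]^* [k; R] = \sum_C n_C\, [j; C]^* [k; C] = h\, \delta_{jk}$.

The summation is to be extended in this and all similar formulas over all
group elements;

(4) $h = \sum_R 1$

is the order of the group. The summation over $C$ in the second part of (3a)
is to be extended over all different classes, $n_C$ is the number of elements of the
class $C$. The completeness relations yield

(5) $\sum_j \sum_{\kappa\lambda} [j]\, [j; R]^*_{\kappa\lambda} [j; S]_{\kappa\lambda} = h\, \delta_{R,S}$,

(5a) $\sum_j [j; R]^* [j; S] = \Lambda_{RS}\, h/n_R$.

The summation over $j$ is to be extended over all different irreducible representations,
$\delta_{R,S}$ is 1 for $R = S$, zero otherwise, $\Lambda_{RS}$ is 1 if $R$ and $S$ are in the
same class, zero otherwise, $n_R$ is the number of the elements of the class of $R$.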


All representations are assumed to be in the unitary form, i. e.

(6) $[j; R^{-1}]_{\kappa\lambda} = [j; R]^*_{\lambda\kappa}$.

The irreducible representations can be classified,² in general, into three groups:
those which can be transformed into a real form, those which cannot but are
equivalent to the conjugate complex representation, and those which are not
equivalent to the conjugate complex representation. In analogy to the notation
customarily used for the two dimensional unimodular unitary group, we shall
call the representations of the first kind integer representations. Correspondingly
$c_j = 1$ will hold for representations $j$ which can be transformed
into a real form, $c_j = -1$ will hold for half integer representations $j$ which
cannot be transformed into a real form but are equivalent to the conjugate complex
representation. Finally $c_j = 0$ if the representation $j$ is not equivalent to
the conjugate complex of $j$. According to G. Frobenius and I. Schur²

(7) $\sum_R [j; R^2] = c_j h$.
² G. Frobenius and I. Schur, Berl. Ber. 1906, p. 186.

2. The number of square roots of an element $R$ will be denoted by $\zeta(R)$.
We have

$\sum_R \zeta(R)^2 = \sum_{R,S} \delta_{R,S^2}\, \zeta(R) = \sum_S \zeta(S^2) = \sum_{S,T} \delta_{S^2,T^2}$.

One can replace $S$ by $TR$ in the last summation and obtain

$\sum_R \zeta(R)^2 = \sum_{R,T} \delta_{TRTR,\,T^2} = \sum_{R,T} \delta_{R,\,TR^{-1}T^{-1}}$

as $TRTR = T^2$ if and only if $R = TR^{-1}T^{-1}$. For a given $R$, there will be a
$T$ such that $R = TR^{-1}T^{-1}$ only if $R$ and $R^{-1}$ are in the same class, i.e. if $R$
is in an ambivalent class. In this case, the number of $T$ satisfying $R =
TR^{-1}T^{-1}$ is equal to $h/n_R$, since each of the $n_R$ members of the class of $R$ is
obtained $h/n_R$ times when $T$ runs over all $h$ elements of the group. Hence


(9) $\sum_R \zeta(R)^2 = \sum_R h/n_R = h \cdot$ (number of ambivalent classes).

The second summation is to be extended only over the elements of the ambivalent
classes. The result thus obtained³ holds for every finite group:

THEOREM 1. The sum of the squares of the numbers of square roots of
all elements of a finite group is equal to the order of the group, multiplied by
the number of ambivalent classes.

All classes are ambivalent in the S. R. groups. Hence

(9a) $\sum_R \zeta(R)^2 = hn$

holds for these, where $n$ is the number of all classes.
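
Theorem 1 is easy to test by brute force on small groups. The following sketch is an illustration only, not part of the original derivation: groups are represented as lists of permutation tuples, and the helper names (`compose`, `inverse`, `conjugacy_class`, `zeta`, `theorem1_sides`) are hypothetical. It compares both sides of the theorem for the symmetric group of the third degree and for the cyclic group of order 3, in which only the class of the unit element is ambivalent.

```python
from itertools import permutations


def compose(p, q):
    """Product of permutations stored as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def conjugacy_class(g, group):
    """Class of g: all elements t g t^-1."""
    return frozenset(compose(compose(t, g), inverse(t)) for t in group)


def zeta(g, group):
    """Number of square roots of g in the group."""
    return sum(1 for r in group if compose(r, r) == g)


def theorem1_sides(group):
    """Return (sum of zeta(R)^2, order times number of ambivalent classes)."""
    lhs = sum(zeta(g, group) ** 2 for g in group)
    classes = {conjugacy_class(g, group) for g in group}
    ambivalent = [c for c in classes if all(inverse(g) in c for g in c)]
    return lhs, len(group) * len(ambivalent)


s3 = list(permutations(range(3)))
z3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
print(theorem1_sides(s3))  # (18, 18)
print(theorem1_sides(z3))  # (3, 3)
```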
The number of times the representation $i$ is contained in the Kronecker
product of the representations $j$ and $k$ is given by the equation

(10) $(i, j, k) = \sum_C [i; C]^* [j; C] [k; C]\, n_C/h$

where the summation has to be extended, as in (3a), over all classes. Multiplying
(7) by $[j; S]$ and summing over $j$ gives, for (5a),

(11) $h \sum_j c_j [j; S] = \sum_R \Lambda_{R^2,S}\, h/n_S = h\, \zeta(S)$.

The $R^2 = S$ equation is satisfied for $\zeta(S)$ group elements $R$ but $R^2$ is in the
class of $S$ for $n_S \zeta(S)$ group elements.

³ This must have been known to the authors of Reference 2 since it follows immediately
from a comparison of the last sentence of §4 with the sentence in italics on
page 201.

LEMMA 1. The Kronecker product of two integer representations or of
two half integer representations of a S. R. group contains only integer representations;
the Kronecker product of an integer and a half integer representation
contains only half integer representations. The unitary matrix which
transforms an integer representation into the conjugate complex form is
symmetric, that which transforms a half integer representation into the conjugate
complex form is skew symmetric.² Hence the unitary matrix $S$ which
transforms the Kronecker product $M$ of two integer or two half integer
representations into the conjugate complex form is symmetric:

(12) $SM = M^*S$; $S = S'$.

If the unitary $U$ brings $M$ into the reduced form $UMU^{-1} = M_1$,

(12a) $S_1 M_1 = M_1^* S_1$; $S_1 = U^* S U^{-1}$

and $S_1$ is again symmetric. Since the corresponding parts of $M_1$ and $M_1^*$ are
equivalent and since $M_1$ does not contain any irreducible representation more
than once, $S_1$ is a step matrix just as $M_1$ is and every submatrix of $S_1$
is symmetric on account of the symmetry of $S_1$. Hence, every submatrix of
$M_1$, i.e. all irreducible parts of $M$, can be transformed into the conjugate
complex form by a symmetric matrix and are integer representations.

If $M$ is the product of an integer and a half integer representation, $S$
will be skew symmetric and the same will hold for $S_1$ and its submatrices.
Consequently, all the irreducible parts of $M$ will be half integer representations.

For a S. R. group $c_i c_j c_k = 1$ if $(ijk)$ is different from zero.

Since the $(ijk)$ are positive integers or zero,

(13) $(ijk)^2 \ge c_i c_j c_k\, (ijk)$.

The equality sign can hold only if either $(ijk) = 0$, or $(ijk) = 1$ and
$c_i c_j c_k = 1$. Hence

(13a) $\sum_{ijk} (ijk)^2 \ge \sum_{ijk} c_i c_j c_k\, (ijk)$

and the equality sign can hold only if for all $i$, $j$, $k$ either $(ijk) = 0$ or
$(ijk) = 1$ and $c_i c_j c_k = 1$. This is the case, according to the definition of
S. R. groups and Lemma 1, for S. R. groups and conversely, if the equality
sign holds in (13a), the group must be a S. R. group.


Because of (11), we have

$\sum_{ijk} c_i c_j c_k\, (i, j, k) = \sum_{ijk} \sum_C c_i c_j c_k\, [i; C]^* [j; C] [k; C]\, n_C/h = \frac{1}{h} \sum_R \zeta(R)^3$.

For the left side of (13a) we have, because of (5a),

$\sum_{ijk} (i, j, k)^2 = \sum_R h/n_R^2 = \frac{1}{h} \sum_R v_R^2$

where $v_R = h/n_R$ is the number of elements which commute with $R$. Hence
(13a) is equivalent to

THEOREM 2. The inequality

(15) $\sum_R \zeta(R)^3 \le \sum_R v_R^2$

holds for every finite group. The equality sign in (15) holds for all finite
S. R. groups and only these.
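
The criterion of Theorem 2 lends itself to a direct numerical check. The sketch below is illustrative only: the helper names are hypothetical and groups are again given as lists of permutation tuples. It evaluates the two sums $\sum_R \zeta(R)^3$ and $\sum_R v_R^2$ for the symmetric groups of the third and fourth degree, which the text lists among the S. R. groups, and for the cyclic group of order 3, whose classes are not all ambivalent.

```python
from itertools import permutations


def compose(p, q):
    """Product of permutations stored as tuples: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def zeta(g, group):
    """Number of square roots of g in the group."""
    return sum(1 for r in group if compose(r, r) == g)


def centralizer_order(g, group):
    """Number of elements which commute with g (v_R in the text)."""
    return sum(1 for t in group if compose(t, g) == compose(g, t))


def theorem2_sides(group):
    """Return (sum of zeta(R)^3, sum of v_R^2)."""
    cubes = sum(zeta(g, group) ** 3 for g in group)
    squares = sum(centralizer_order(g, group) ** 2 for g in group)
    return cubes, squares


s3 = list(permutations(range(3)))
s4 = list(permutations(range(4)))
z3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
for name, group in [("S3", s3), ("S4", s4), ("Z3", z3)]:
    print(name, theorem2_sides(group))
# S3 (66, 66)
# S4 (1032, 1032)
# Z3 (3, 27)
```

Equality holds for the two symmetric groups, while the cyclic group gives a strict inequality, in agreement with the statement of the theorem.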


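3. The Kronecker product of a representation with itself can be decomposed
into a symmetric part

(16) $B(R)_{\kappa\mu;\lambda\nu} = \tfrac{1}{2} \left( [j; R]_{\kappa\lambda} [j; R]_{\mu\nu} + [j; R]_{\kappa\nu} [j; R]_{\mu\lambda} \right)$

and an antisymmetric part

(16a) $A(R)_{\kappa\mu;\lambda\nu} = \tfrac{1}{2} \left( [j; R]_{\kappa\lambda} [j; R]_{\mu\nu} - [j; R]_{\kappa\nu} [j; R]_{\mu\lambda} \right)$.

It is easy to see that both $B$ and $A$ form a representation of the group. The
irreducible parts of both $B$ and $A$ are integer representations in case of S. R.
groups. The irreducible parts of the $B$ for integer $j$ and the irreducible parts
of the $A$ for half integer $j$ will be called even representations. Conversely, the
irreducible parts of the $A$ for integer $j$ and the irreducible parts of the $B$ for
half integer $j$ will be called odd representations. This
notation is taken again from the theory of representations of the two dimensional
unitary group.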
THEOREM 3. In S. R. groups no representation can be both even and odd.

The trace of the symmetric part of the square of the representation $j$ is

(17) $X_{js}(R) = \tfrac{1}{2} [j; R]^2 + \tfrac{1}{2} [j; R^2]$

and the trace of the antisymmetric part is

(17a) $X_{ja}(R) = \tfrac{1}{2} [j; R]^2 - \tfrac{1}{2} [j; R^2]$.

The condition that two representations have no common part is that the sum
of the products of their characters vanish. Theorem 3 is equivalent therefore
with the validity of


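(18) $\sum_R \left( [j; R]^2 + c_j [j; R^2] \right) \left( [k; R]^2 - c_k [k; R^2] \right) = 0$

for all $j$ and $k$. Since the left side of (18) by its nature cannot be negative
(it is $4h \sum_i n_i m_i$ where $n_i$ and $m_i$ are the numbers of times the representation $i$ is
contained in the first and second representation of (18)), the validity of (18)
for all $j$ and $k$ is equivalent with the vanishing of

(18a) $\sum_{j,k} \sum_R \left( [j; R]^2 + c_j [j; R^2] \right) \left( [k; R]^2 - c_k [k; R^2] \right)$
$= \sum_R \left( h/n_R + \zeta(R^2) \right) \left( h/n_R - \zeta(R^2) \right)$
$= \sum_R \left( v_R^2 - \zeta(R^2)^2 \right)$

where (5a) and (11) have been utilized. Now evidently

$\sum_R \zeta(R^2)^2 = \sum_{S,R} \delta_{S,R^2}\, \zeta(S)^2 = \sum_S \zeta(S)^3$

so that (18a) vanishes on account of Theorem 2. Hence Theorem 3 is valid.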

Of course, the Kronecker product of an even and an odd representation,
e.g., contains, in general, both even and odd representations. It has not been
shown, either, that every integer representation is either even, or odd. In fact,
one can easily find a group which has an integer representation which does
not occur in the square of any representation. A group of this character is
formed by the elements $1, -1, x, -x, y, -y, z, -z$, with the multiplication
rules $x^2 = y^2 = 1$, $z^2 = -1$, $xy = -yx = z$, $xz = -zx = y$, $zy = -yz = x$.


PRINCETON UNIVERSITY. 




SUBSPACES OF $l_{p,n}$ SPACES.*


By F. BOHNENBLUST.


1. Introduction. For any Banach space $B$, the following numbers $a_\nu$
can be introduced:

$a_\nu = \inf \|P\|$,

where $P$ is any linear projection of $B$ on any subspace of given dimension $\nu$.
For any Banach space $a_\nu \ge 1$ and $a_1 = 1$. When $B$ is the Hilbert space, all
$a_\nu = 1$. If there exists a base in $B$, then the $a_\nu$ are bounded. For by a theorem
of Schauder,¹ if $x_1, x_2, \cdots, x_n, \cdots$ is a base of $B$, and $x = \sum_{\nu} \xi_\nu x_\nu$,
the new norm $|||x||| = \sup_n \| \sum_{\nu=1}^{n} \xi_\nu x_\nu \|$ is isomorphic with the original one:
$\|x\| \le |||x||| \le C \|x\|$, and the projection $P_n x = \sum_{\nu=1}^{n} \xi_\nu x_\nu$ has a norm $\le C$.
Thus $a_\nu \le C$.

If separable Banach spaces $B$ exist for which the $a_\nu$ are not bounded,
the open question, whether or not every separable Banach space admits a base,
will be answered negatively. In the present paper we take a first step in this
direction, by constructing simple finite dimensional spaces (of dimension $l$,
say) for which every $a_m > 1$, when $1 < m < l$.

THEOREM. Let $(\xi_1, \dots, \xi_l)$, $\xi_\lambda$ real, be a point of an l-dimensional linear space, and let

$\|x\| = [\sum_{\nu=1}^{n} |f_\nu(\xi)|^p]^{1/p}$,

where: (1) p is a finite real number, $p > 1$, $p \ne$ integer;
(2) $n > 2(2l - 3)$;
(3) the $f_\nu(\xi)$ are linear forms, $f_\nu(\xi) = \sum_\lambda a_{\nu\lambda} \xi_\lambda$.
In general this space will be such that $a_m > 1$, for $1 < m < l$.
('In general' means that the $a_{\nu\lambda}$ must satisfy certain relations, which are described in the text). Such a space can be considered more simply as an l-dimensional subspace of $l_{p,n}$, where $l_{p,n}$ is the classical space of elements


* Received May 3, 1940. 
¹ E.g. Banach, Théorie des opérations linéaires, p. 111.


$(\xi_1, \dots, \xi_n)$ with the norm $\|x\| = [\sum_\nu |\xi_\nu|^p]^{1/p}$. It is in this form that we shall verify our result.


2. Plücker Grassmann coordinates. Let $S_m$ be an m-dimensional linear subspace of the n-dimensional affine space $R_n$ of elements $x = \{\xi(\nu)\}$, $\nu = 1, 2, \dots, n$. If the subspace $S_m$ is determined by the elements

$x(\mu) = \{\xi(\mu, \nu)\}$ $(\mu = 1, \dots, m)$,

the values $p(\nu_1, \nu_2, \dots, \nu_m)$ of the determinants

$p(\nu_1, \nu_2, \dots, \nu_m) = \begin{vmatrix} \xi(1, \nu_1) & \cdots & \xi(1, \nu_m) \\ \vdots & & \vdots \\ \xi(m, \nu_1) & \cdots & \xi(m, \nu_m) \end{vmatrix}$

are the Plücker Grassmann coordinates of the space $S_m$. Not all of them vanish, and considered as homogeneous coordinates, they are determined by $S_m$ independently of the particular choice of the elements $x(\mu)$. Conversely they determine the space $S_m$ uniquely. The Plücker Grassmann coordinates satisfy furthermore the relations²

(I) $p(\nu_1, \dots, \nu_m)$ changes sign when the indices are permuted by an odd permutation,

(II) $p(\nu_1, \nu_2, N)\, p(\nu_3, \nu_4, N) - p(\nu_1, \nu_3, N)\, p(\nu_2, \nu_4, N) + p(\nu_1, \nu_4, N)\, p(\nu_2, \nu_3, N) = 0$

for any four indices $\nu_1, \nu_2, \nu_3, \nu_4$ and any set N of $(m - 2)$ indices.
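Both relations can be checked on a concrete basis. The following is a minimal sketch, assuming the coordinates are the m-by-m minors of the basis matrix and that relation (II) is the usual three-term relation; the function names and the sample matrix are illustrative only, and indices are 0-based.

```python
def det(rows):
    """Determinant by Laplace expansion along the first row (adequate for small m)."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [r[:j] + r[j + 1:] for r in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def plucker(basis, cols):
    """p(v1, ..., vm): determinant of the columns `cols` of the m x n basis matrix."""
    return det([[row[c] for c in cols] for row in basis])

def three_term(basis, v1, v2, v3, v4, rest=()):
    """Left side of relation (II); `rest` plays the role of the index set N."""
    def p(a, b):
        return plucker(basis, [a, b] + list(rest))
    return p(v1, v2) * p(v3, v4) - p(v1, v3) * p(v2, v4) + p(v1, v4) * p(v2, v3)

# Hypothetical basis of a 3-dimensional subspace of R_5.
BASIS = [
    [1, 2, 0, 3, 1],
    [0, 1, 4, 1, 2],
    [2, 0, 1, 1, 5],
]
```

For this basis `three_term(BASIS, 0, 1, 2, 3, rest=(4,))` evaluates the left side of (II) with N consisting of the last index, and swapping two arguments of `plucker` reverses the sign as in (I).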
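Essentially, these are all the relations the Plücker Grassmann coordinates satisfy. In particular, if $p(\nu_1, \nu_2, \dots, \nu_m) = 1$, the Plücker Grassmann coordinates

(1.1) $p(\nu_1, \dots, \nu_{\mu-1}, \nu, \nu_{\mu+1}, \dots, \nu_m)$,

where $\nu$ is any index different from $\nu_1, \nu_2, \dots, \nu_m$, are independent and determine $S_m$ uniquely, for example, by the elements $x(\mu) = \{\xi(\mu, \nu)\}$

(1.2) $\xi(\mu, \nu) = \delta_{\mu\mu'}$ if $\nu = \nu_{\mu'}$, and $\xi(\mu, \nu) = p(\nu_1, \dots, \nu_{\mu-1}, \nu, \nu_{\mu+1}, \dots, \nu_m)$ otherwise.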

All the subspaces $S_m$ in $R_n$ form thus a locally euclidean, $m(n - m)$-dimensional manifold $\mathfrak{M}$, or more precisely $\mathfrak{M}(n, m)$. In a neighborhood of any point $S_m$ of $\mathfrak{M}$, a set (1.1) of Plücker Grassmann coordinates acts as a set of euclidean coordinates, and a finite number of such neighborhoods cover the entire manifold $\mathfrak{M}$. $\mathfrak{M}$ is an algebraic manifold in the $(n^m - 1)$-dimensional


² E.g. Sommerville, An Introduction to the Geometry of n Dimensions,




projective space $P(n^m - 1)$ of coordinates $p(\nu_1, \dots, \nu_m)$. (The number of dimensions could easily be reduced, but this will be of no importance to us). A subset of $\mathfrak{M}$ will be called an algebraic manifold in $\mathfrak{M}$, if it is an algebraic manifold in the projective space $P(n^m - 1)$. We shall need the following result, which follows immediately from the relations between the Plücker Grassmann coordinates of a space and those of one of its subspaces:


THEOREM 1.1. If A is a subset of an algebraic manifold of $\mathfrak{M}(n, m)$, whose dimension is $\le a$, then all the points of $\mathfrak{M}(n, l)$, $(l \ge m)$, which represent subspaces $S_l$ containing at least one subspace $S_m$ belonging to A, form a subset B which lies in an algebraic manifold in $\mathfrak{M}(n, l)$, whose dimension is $\le a + (n - l)(l - m)$.


3. The sets $G_0, G_1, \dots, G_k$. Let $S_m$ be an m-dimensional subspace of $R_n$, and let $p(\nu_1, \dots, \nu_m)$ be its Plücker Grassmann coordinates.


DEFINITION 2.1. An index $\nu_0$ will be said to belong to $G_0$, if and only if, $p(\nu_0, N) = 0$ for every set N of $(m - 1)$ indices.


THEOREM 2.1. The index $\nu_0$ belongs to $G_0$, if and only if, the $\nu_0$-th coordinate of every element of $S_m$ vanishes.


The condition is obviously sufficient. To verify the necessity we observe that at least one Plücker Grassmann coordinate of $S_m$ is different from zero, say $p(1, 2, \dots, m) = 1$. If $\nu_0$ belongs to $G_0$, it must be greater than m and the $\nu_0$-th coordinate of each of the vectors (1.1) vanishes. The same holds true for any linear combination of these elements.


DEFINITION 2.2. Two indices $\nu_1$ and $\nu_2$ of $G'_0$ (= the complementary set of $G_0$) will be said to be equivalent, $\nu_1 \sim \nu_2$, if and only if, $p(\nu_1, \nu_2, N) = 0$ for every set N of $(m - 2)$ indices.

This notion of equivalence is evidently symmetric and reflexive. It is also transitive: assume $\nu_1 \sim \nu$ and $\nu_2 \sim \nu$, but $\nu_1$ and $\nu_2$ not equivalent. There exists then a set N of $(m - 2)$ indices, such that $p(\nu_1, \nu_2, N) = 1$. The index $\nu$ cannot belong to N and it follows from (1.1) that $\nu$ should belong to $G_0$, which contradicts the implicit assumption that $\nu$ belongs to $G'_0$.


DEFINITION 2.3. The sets $G_1, \dots, G_k$ are the sets into which the notion of equivalence of the preceding definition divides the indices of the complementary set of $G_0$. The sets $G_0, G_1, \dots, G_k$ will be referred to as the "type" of the subspace $S_m$.
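The type of a subspace can be computed directly from Definitions 2.1 to 2.3. The sketch below is illustrative only: it assumes $m \ge 2$, uses 0-based indices, takes a basis matrix as input, and the name `subspace_type` is not from the paper.

```python
from itertools import combinations

def det(rows):
    """Determinant by Laplace expansion along the first row."""
    if len(rows) == 1:
        return rows[0][0]
    total = 0
    for j, entry in enumerate(rows[0]):
        minor = [r[:j] + r[j + 1:] for r in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def subspace_type(basis):
    """Return (G0, [G1, ..., Gk]) for the row space of `basis` (m >= 2 rows)."""
    m, n = len(basis), len(basis[0])

    def p(cols):
        return det([[row[c] for c in cols] for row in basis])

    def vanishes(fixed):
        # True if p(fixed, N) = 0 for every completion N of the fixed indices.
        rest = [v for v in range(n) if v not in fixed]
        return all(
            p(list(fixed) + list(extra)) == 0
            for extra in combinations(rest, m - len(fixed))
        )

    g0 = [v for v in range(n) if vanishes((v,))]          # Definition 2.1
    classes = []
    for v in range(n):
        if v in g0:
            continue
        for cls in classes:
            if vanishes((cls[0], v)):                     # Definition 2.2
                cls.append(v)
                break
        else:
            classes.append([v])
    return g0, classes
```

For the basis `[[1, 2, 0, 0, 0], [0, 0, 1, 1, 0]]` the last index lies in $G_0$ and the remaining four indices fall into two classes of two indices each, so that $k = m = 2$.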


Since at least one Plücker Grassmann coordinate does not vanish, we see that $m \le k$. Furthermore, if $n_0, n_1, \dots, n_k$ denote the number of indices which are contained in the sets $G_0, G_1, \dots, G_k$, and if $k_1, k_2, \dots$ denote the numbers of sets among $G_1, \dots, G_k$ (i.e. exclusive of $G_0$) which contain respectively one, two, $\dots$ indices, the following relations will hold:

$n_0 \ge 0$, $n_\kappa \ge 1$, $\kappa = 1, 2, \dots, k$;

$k_1 + k_2 + \cdots = k$, $k_1 + 2k_2 + 3k_3 + \cdots = n - n_0$.


THEOREM 2.2. A necessary and sufficient condition that two indices $\nu_1$ and $\nu_2$ of $G'_0$ be equivalent is that the Plücker coordinate $p(\nu_1, \nu_2)$ of every two-dimensional subspace of $S_m$ be equal to zero.

Proof. Let $\nu_1$ and $\nu_2$ belong to different sets G. There exist $\nu_3, \nu_4, \dots, \nu_m$ such that $p(\nu_1, \nu_2, \dots, \nu_m) = 1$, and thus vectors $x(\mu) = \{\xi(\mu, \nu)\}$ in $S_m$ for which

(2.2) $\xi(\mu, \nu_{\mu'}) = \delta_{\mu\mu'}$.

The vectors $x(1)$ and $x(2)$ determine then a two-dimensional subspace whose $p(\nu_1, \nu_2) = 1$. This shows the sufficiency. Assume now the existence of a two-dimensional subspace with $p(\nu_1, \nu_2) = 1$. There exist in this subspace two vectors $x(1)$ and $x(2)$ satisfying equations (2.2) for $\mu, \mu' = 1, 2$. They are linearly independent and can be completed by $x(3), x(4), \dots, x(m)$ to form a base for $S_m$. In addition, we may assume equations (2.2) to be satisfied for $\mu > 2$, $\mu' = 1, 2$ and it is then readily seen that at least one $p(\nu_1, \nu_2, N)$ of $S_m$ does not vanish, i.e. that the indices $\nu_1$ and $\nu_2$ are not equivalent.


Theorems 2.1 and 2.2 imply then the


THEOREM 2.3. The type of any subspace of a subspace $S_m$ is an aggregation of the type of $S_m$; i.e. every set of the type of $S_m$ is contained entirely in one of the sets of the type of the subspace of $S_m$. In particular the set $G_0$ of $S_m$ is contained in the set $G_0$ of the subspace.

We complete this theorem by proving next


THEOREM 2.4. In every $S_m$, there exists a two-dimensional (and thus of any dimension $< m$, but $> 1$) subspace of the same type as $S_m$.


Proof. Choose from each set of the type of $S_m$ one index. Let us denote these by $\nu_1^{\circ}, \dots, \nu_k^{\circ}$. Let $x(\mu) = \{\xi(\mu, \nu)\}$ be a base of $S_m$ and let $\lambda_1, \dots, \lambda_m$, $\lambda'_1, \dots, \lambda'_m$ be independent variables. The Plücker coordinates of the subspace determined by the two vectors $\sum_\mu \lambda_\mu x(\mu)$ and $\sum_\mu \lambda'_\mu x(\mu)$ are bi-linear in these variables and by theorem 2.2 none of the $p(\nu_1^{\circ}, \nu_2^{\circ}), \dots, p(\nu_1^{\circ}, \nu_k^{\circ}), \dots, p(\nu_{k-1}^{\circ}, \nu_k^{\circ})$ are identically zero. There exist values of the variables such that none of them vanishes, and the type of this subspace coincides with the type of $S_m$.


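THEOREM 2.5. If $S_m$ is of type $G_0, \dots, G_k$ and if $\nu_1^{\circ}, \dots, \nu_k^{\circ}$ are indices belonging to $G_1, \dots, G_k$ respectively, there exist real numbers $\alpha(\nu)$ defined for $\nu$ in $G'_0$, and in the k-dimensional affine space $R_k$ an m-dimensional subspace $T_m$ with Plücker Grassmann coordinates $q(\kappa_1, \dots, \kappa_m)$ such that

(1) $\alpha(\nu) \ne 0$ for $\nu$ in $G'_0$,

(2) $\alpha(\nu_\kappa^{\circ}) = 1$,

(3) for any indices $\nu_\mu$ in $G'_0$, $p(\nu_1, \dots, \nu_m) = \alpha(\nu_1) \cdots \alpha(\nu_m)\, q(\kappa_1, \dots, \kappa_m)$, where $\kappa_\mu$ is the index of the set G which contains $\nu_\mu$,

(4) The type of $T_m$ is: $G_0$ is void, every other G has exactly one index.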


Proof. Let $b = \{\beta(\nu)\}$ be a vector of $S_m$ for which $\beta(\nu)$ is different from zero when $\nu$ lies in $G'_0$. (The existence of such a vector b follows immediately from the definition of $G_0$). Define, for $\nu$ in $G'_0$, the $\alpha(\nu)$ by the equations $\alpha(\nu) = \beta(\nu)/\beta(\nu_\kappa^{\circ})$, where $\nu_\kappa^{\circ}$ is the index corresponding to the set which contains $\nu$. Let, finally, $\nu_1 \sim \nu'_1$ and let $\nu_2, \dots, \nu_m$ be any set of $(m - 1)$ indices of $G'_0$. Since the vector b lies in $S_m$, and since $p(\nu_1, \nu'_1, N) = 0$, we have the relation

$\beta(\nu'_1)\, p(\nu_1, \nu_2, \dots, \nu_m) = \beta(\nu_1)\, p(\nu'_1, \nu_2, \dots, \nu_m)$.

By an induction proof, we see that

$\beta(\nu'_1) \cdots \beta(\nu'_m)\, p(\nu_1, \nu_2, \dots, \nu_m) = \beta(\nu_1) \cdots \beta(\nu_m)\, p(\nu'_1, \nu'_2, \dots, \nu'_m)$

if $\nu_\mu \sim \nu'_\mu$. In other words we can put

$p(\nu_1, \dots, \nu_m) = \alpha(\nu_1) \cdots \alpha(\nu_m)\, q(\kappa_1, \dots, \kappa_m)$.

If the $\nu_\mu$ differ from each other and happen to be chosen among the indices $\nu_1^{\circ}, \dots, \nu_k^{\circ}$, all the corresponding $\alpha$ are equal to one and $p(\cdots) = q(\cdots)$. Since, furthermore, at least one of these p does not vanish, the numbers q are the Plücker Grassmann coordinates of a subspace $T_m$ in $R_k$. The statements (1), (2), (3) of the theorem have thus been verified. To verify the last one we remark first, that if $q(\kappa, K)$ is always equal to zero for any K, the coordinates $p(\nu_\kappa^{\circ}, N)$ will all be zero, which is impossible since $\nu_\kappa^{\circ}$ does not belong to $G_0$. Thus the set $G_0$ of $T_m$ must be void. Secondly, if $q(\kappa_1, \kappa_2, K) = 0$, for any K, we have similarly $p(\nu_{\kappa_1}^{\circ}, \nu_{\kappa_2}^{\circ}, N) = 0$ which implies $\nu_{\kappa_1}^{\circ} \sim \nu_{\kappa_2}^{\circ}$, whereas they belong to different sets G. Thus the other sets of the type of $T_m$ can contain only one element.

We remark in passing that the converse of the theorem is also true, and also that a vector $x = \{\xi(\nu)\}$ belongs to $S_m$, if and only if, it is of the form

$\xi(\nu) = 0$ if $\nu$ in $G_0$,
$\xi(\nu) = \alpha(\nu) \cdot \eta(\kappa)$ if $\nu$ in $G'_0$,

where $\kappa$ corresponds to $\nu$ and $y = \{\eta(\kappa)\}$ is a vector of $T_m$.

The coordinates of any subspace $S_m$ of a given type $G_0, \dots, G_k$ are thus expressed as polynomials in terms of $k_2 + 2k_3 + \cdots = n - n_0 - k$ variables $\alpha$ and certain q which lie in the $m(k - m)$-dimensional algebraic manifold $\mathfrak{M}(k, m)$. These subspaces $S_m$ will lie, therefore, in an algebraic manifold $A(m; G_0, \dots, G_k)$ in $\mathfrak{M}(n, m)$ of dimension $\le n - n_0 - k + m(k - m)$. The union of algebraic manifolds is an algebraic manifold, and thus for a given integer k, the $S_m$ which are of a type $G_0, \dots, G_k$ will lie in an algebraic manifold $A(m, k)$ of dimension $\le n - k + m(k - m)$.

Let l be an integer $\ge m$. Define $B(m, l)$ as the union of those $A(m, k)$ for which $(m - 1)k < m(n - l) + m^2 - n$, with the understanding that B is void if there are no k satisfying this inequality. By theorem 1.1, the points $S_l$ of $\mathfrak{M}(n, l)$ which are subspaces $S_l$ containing at least one subspace $S_m$ which belongs to $B(m, l)$, will lie in an algebraic manifold $C(m, l)$ of dimension $< m(n - l) + (n - l)(l - m)$. The dimension of the union $C(l) = C(2, l) + \cdots + C(l - 1, l)$ is $< l(n - l)$, i.e. less than the dimension of $\mathfrak{M}(n, l)$.
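The dimension count in this paragraph is elementary arithmetic and can be confirmed by brute force. The sketch below assumes the bound $n - k + m(k - m)$ for $A(m, k)$ and the defining inequality $(m - 1)k < m(n - l) + m^2 - n$ of $B(m, l)$; the function names are illustrative.

```python
def dim_A(n, m, k):
    """Upper bound n - k + m(k - m) for the dimension of A(m, k)."""
    return n - k + m * (k - m)

def belongs_to_B(n, l, m, k):
    """Inequality selecting the A(m, k) that enter the union B(m, l)."""
    return (m - 1) * k < m * (n - l) + m * m - n

def dimension_count_holds(max_n):
    """Check that the bound for C(m, l) stays below l(n - l) in every admissible case."""
    for n in range(3, max_n + 1):
        for l in range(3, n + 1):
            for m in range(2, l):
                for k in range(m, n + 1):
                    if belongs_to_B(n, l, m, k):
                        bound_C = dim_A(n, m, k) + (n - l) * (l - m)
                        if not bound_C < l * (n - l):
                            return False
    return True
```

The check succeeds because the defining inequality of $B(m, l)$ is the same as $n - k + m(k - m) < m(n - l)$, and $m(n - l) + (n - l)(l - m) = l(n - l)$.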


DEFINITION 2.4. A subspace $S_l$ in $R_n$ will be said to be in general position, if all Plücker Grassmann coordinates of $S_l$ are different from zero and if $S_l$ does not lie in $C(l)$.


The existence of $S_l$ in general position follows from our consideration of the dimension of $C(l)$ and the fact that the $S_l$, for which one Plücker Grassmann coordinate vanishes, form also a manifold of dimension $< l(n - l)$. The construction of $C(l)$ allows us to state the following theorem.


THEOREM 2.6. Let $S_l$ be a subspace of $R_n$ in general position. The number k of the type of any subspace $S_m$ of $S_l$ $(2 \le m \le l - 1)$ satisfies the inequality

$(m - 1)k \ge m(n - l) + m^2 - n$.


Since $2k \le k_1 + (k_1 + 2k_2 + 3k_3 + \cdots) \le k_1 + n$ we obtain from the last theorem that

$k_1 \ge n - \frac{2m}{m - 1}(l - m)$.

If we assume furthermore that $n > 2(2l - 3)$, the last inequality takes on the form, substituting for $m/(m - 1)$ its maximal value 2,

$k_1 > 4m - 6 = m + 3(m - 2) \ge m$.
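The last step can be confirmed numerically. The sketch below assumes the intermediate bound $k_1 \ge n - \frac{2m}{m-1}(l - m)$, which is what results from combining $2k \le k_1 + n$ with the inequality of theorem 2.6, and tests it for the smallest admissible n; the function names are illustrative.

```python
from fractions import Fraction

def k1_lower_bound(n, l, m):
    """Lower bound n - 2m(l - m)/(m - 1) for k_1, in exact arithmetic."""
    return n - Fraction(2 * m * (l - m), m - 1)

def bound_exceeds_m(max_l):
    """Check that the bound is > m whenever n > 2(2l - 3) and 2 <= m <= l - 1."""
    for l in range(3, max_l + 1):
        n = 2 * (2 * l - 3) + 1  # smallest n satisfying n > 2(2l - 3)
        for m in range(2, l):
            if not k1_lower_bound(n, l, m) > m:
                return False
    return True
```

Larger n only increase the bound, so testing the smallest admissible n for each l suffices.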
In other words we have obtained the following result. 

THEOREM 2.7. If $S_l$ is in general position in an affine space of dimension $> 2(2l - 3)$, the number $k_1$ of the sets of the type of any subspace of $S_l$ (whose dimension is at least 2), which contain only one index, is greater than the dimension of the subspace of $S_l$.

THEOREM 2.8. Let p be any real number $> 1$ and not an integer. Let $S_m$ in $R_n$ be of the type $k = n$, i.e. $G_0$ is void, and all other G contain exactly one index. If

$\sum_\nu \gamma(\nu)\, |\xi(\nu)|^{p-1} \operatorname{sign} \xi(\nu)$

vanishes for every vector $x = \{\xi(\nu)\}$ of $S_m$, then every $\gamma(\nu)$ must be equal to zero.

Proof. By theorem 2.4 there exists a two-dimensional subspace $S_2$ of $S_m$, of the same type as $S_m$. Let $x^0 = \{\xi^0(\nu)\}$ and $x^1 = \{\xi^1(\nu)\}$ be two linearly independent vectors of $S_2$, where we assume in addition that $\xi^0(\nu) \ne 0$ for every $\nu$. (Since $G_0$ is void, this is no restriction). For every real $\lambda$

$\sum_\nu \gamma(\nu)\, |\xi^0(\nu) + \lambda \xi^1(\nu)|^{p-1} \operatorname{sign}(\xi^0(\nu) + \lambda \xi^1(\nu))$

vanishes by assumption. For $|\lambda| < \operatorname{Min}[\,|\xi^0(\nu)/\xi^1(\nu)|\,]$, each term is analytic in $\lambda$ and by evaluating the derivatives at $\lambda = 0$, we obtain for any non-negative integer h the relations

$\sum_\nu \gamma(\nu) \cdot |\xi^0(\nu)|^{p-1}\, [\xi^1(\nu)/\xi^0(\nu)]^h \operatorname{sign} \xi^0(\nu) = 0$.
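Let h assume the values $0, 1, 2, \dots, n - 1$. The determinant of the n linear relations thus obtained is equal to

$\prod_\nu |\xi^0(\nu)|^{p-1} \operatorname{sign} \xi^0(\nu) \cdot \prod_{\nu < \nu'} [\xi^1(\nu')/\xi^0(\nu') - \xi^1(\nu)/\xi^0(\nu)]$

and thus different from zero, since $S_2$ is of the type $k = n$. These relations admit therefore only the trivial solution.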
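The non-vanishing of this determinant is a Vandermonde argument, which can be illustrated in exact arithmetic. The sketch below assumes the relations for $h = 0, 1, \dots, n - 1$ and drops the common nonzero column factors $|\xi^0(\nu)|^{p-1} \operatorname{sign} \xi^0(\nu)$, so that only the ratios $\xi^1(\nu)/\xi^0(\nu)$ remain; the function names and sample vectors are illustrative.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def ratios(x0, x1):
    """The quotients xi^1(v) / xi^0(v); x0 must have no zero coordinate."""
    return [Fraction(b, a) for a, b in zip(x0, x1)]

def relation_matrix(x0, x1):
    """Rows h = 0..n-1 of the relations, with the nonzero column factors removed."""
    t = ratios(x0, x1)
    return [[tv ** h for tv in t] for h in range(len(t))]

def vandermonde_product(t):
    """Product of t[w] - t[v] over all pairs v < w."""
    out = Fraction(1)
    for w in range(len(t)):
        for v in range(w):
            out *= t[w] - t[v]
    return out
```

Two indices have equal ratios exactly when the two-dimensional Plücker coordinate $\xi^0(\nu)\xi^1(\nu') - \xi^0(\nu')\xi^1(\nu)$ vanishes, that is, when they are equivalent; in that case the determinant is zero.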


THEOREM 2.9. A necessary and sufficient condition that

$\sum_\nu \gamma(\nu)\, |\xi(\nu)|^{p-1} \operatorname{sign} \xi(\nu)$

vanishes for every element $x = \{\xi(\nu)\}$ of $S_m$ is that

$\sum_{\nu \text{ in } G_\kappa} \gamma(\nu)\, |\alpha(\nu)|^{p-1} \operatorname{sign} \alpha(\nu) = 0$ for $\kappa = 1, \dots, k$.


Proof. Substitute for $\xi(\nu)$ their expressions in terms of the $\alpha$ and the coordinates of a vector of the space $T_m$ of theorem 2.5 and apply theorem 2.8 to the space $T_m$.


3. Projections of norm one. Let S be a Banach space, such that to every element x of S different from 0, there exists only one functional $f_x$ of norm 1 for which $f_x(x) = \|x\|$. Let S' and S'' be closed subspaces of S, S' contained in S''. Let P be a projection of norm one of S'' onto S'. If $x \ne 0$ is an element in S' and y any element in S'', the linear functional g defined in S'' by $g(y) = f_x(Py)$ has a norm equal to one. It can be extended to S without increasing its norm. Since $g(x) = \|x\|$, we have by assumption $g(y) = f_x(y)$, i.e. the relations

(3.1) $f_x(Py) = f_x(y)$ for any $x \ne 0$, x in S' and any y in S''.


Conversely, if this condition is satisfied for a projection P of S'' onto S', the norm of P must be one. For, let y be any element of S'' such that Py is not the origin. We have then

$\|Py\| = f_{Py}(Py) = f_{Py}(y) \le \|y\|$.
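It is well known that the $l_p$ spaces, and in particular the $l_{p,n}$ spaces, satisfy the condition imposed on S, provided $1 < p < \infty$. The functional $f_x$ is given by

(3.2) $f_x(y) = \{\sum_\nu |\xi(\nu)|^{p-1} \operatorname{sign} \xi(\nu)\, \eta(\nu)\} / \|x\|^{p-1}$

where $x = \{\xi(\nu)\}$ and $y = \{\eta(\nu)\}$. If $S' = S_m$ is of type $G_0, G_1, \dots, G_k$ and $S'' = S_l$, equations (3.1) and (3.2) show that for every y in the $(l - m)$-dimensional subspace where $P = 0$

(3.3) $\sum_\nu |\xi(\nu)|^{p-1} \operatorname{sign} \xi(\nu) \cdot \eta(\nu) = 0$

for every $x = \{\xi(\nu)\}$ in $S_m$, that is, by the preceding theorem,³

(3.4) $\sum_{\nu \text{ in } G_\kappa} |\alpha(\nu)|^{p-1} \operatorname{sign} \alpha(\nu)\, \eta(\nu) = 0$ for $\kappa = 1, \dots, k$.


³ We assume p not an integer.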


Conversely, if $S_m$ is given, if U is the subspace determined by equations (3.4), and if V is the subspace spanned by $S_m$ and U, there exists a projection of norm one of V onto $S_m$. The subspaces $S_m$ and U have only the origin in common. For, if x is in $S_m$ we have $\xi(\nu) = 0$ for $\nu$ in $G_0$ and $\xi(\nu) = \alpha(\nu) \cdot \eta(\kappa)$ otherwise. Substituting these values in (3.4) we see that every $\eta(\kappa)$ must vanish, and consequently that $x = 0$. Let P be the projection of V onto $S_m$, which projects U into the origin. For this projection the equations (3.1) are satisfied and the norm of P must thus be one. Hence we obtain the following result.
THEOREM 3.1. A necessary and sufficient condition that there exists a projection of norm one of $S_l$ onto $S_m$ is that $S_l$ be contained in the subspace spanned by $S_m$ and the subspace U determined by the equations (3.4).

In the particular case where $S_l$ is the entire space $R_n$, it is necessary and sufficient to verify that the dimension of U is equal to $n - m$.

THEOREM 3.2. A necessary and sufficient condition that there exist a projection of norm one of $l_{p,n}$ on $S_m$ is that $k = m$, i.e. that the type of $S_m$ contains exactly m sets besides possibly a set $G_0$.
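A numerical illustration of the case $k = m$ may be helpful. The sketch below assumes a subspace $S_m$ spanned by m vectors with disjoint supports $G_1, \dots, G_m$, given by a hypothetical coefficient vector `alpha`, and builds the projection whose null space is cut out by equations of the form (3.4). The sample values, the exponent 2.5 and the function names are illustrative; a random test of this kind supports the norm bound but does not prove it.

```python
import random

def sign(t):
    return (t > 0) - (t < 0)

def lp_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def projection(y, alpha, blocks, p):
    """Project y onto the span of the vectors alpha restricted to each block G_kappa."""
    out = [0.0] * len(y)  # coordinates outside every block (the set G_0) stay zero
    for block in blocks:
        num = sum(abs(alpha[v]) ** (p - 1) * sign(alpha[v]) * y[v] for v in block)
        den = sum(abs(alpha[v]) ** p for v in block)
        for v in block:
            out[v] = alpha[v] * num / den
    return out

def max_norm_ratio(alpha, blocks, p, samples=200, seed=0):
    """Largest observed value of ||Py|| / ||y|| over random vectors y."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        y = [rng.uniform(-1.0, 1.0) for _ in alpha]
        ratio = lp_norm(projection(y, alpha, blocks, p), p) / lp_norm(y, p)
        worst = max(worst, ratio)
    return worst

# Hypothetical example in l_{p,5}: G_1 = {0, 1}, G_2 = {2, 3}, G_0 = {4}, so k = m = 2.
ALPHA = [1.0, 2.0, -1.0, 3.0, 0.0]
BLOCKS = [[0, 1], [2, 3]]
```

The observed ratios never exceed one, in agreement with Hölder's inequality applied on each block, and vectors of $S_m$ such as `[1.0, 2.0, 0.0, 0.0, 0.0]` are left fixed.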


We return to the general case, where $S_l$ need not be equal to $R_n$. The subspace U lies in the $(n - k_1)$-dimensional subspace W defined by $\xi(\nu) = 0$ for $\nu$ in a set $G_\kappa$ $(\kappa = 1, 2, \dots, k)$ which contains only one index. If $S_l$ is in general position in $R_n$, its Plücker Grassmann coordinates are all different from zero and the intersection of $S_l$ and W is exactly $(l - k_1)$ dimensional.


If there exists a projection of norm 1 of $S_l$ onto $S_m$, the dimension of U and a fortiori of W must be at least $l - m$, i.e. $k_1 \le m$. Comparing this statement with theorem 2.7 we obtain the theorem.

THEOREM 3.3. In a subspace $S_l$ in general position in $l_{p,n}$ (p finite, not equal to an integer), where $n > 2(2l - 3)$, only the identity and projections on one dimensional subspaces can have the norm one.


PRINCETON UNIVERSITY AND 
CALIFORNIA INSTITUTE OF TECHNOLOGY. 


ON GENERALIZED RINGS.* 


By DAVID C. MURDOCH and OYSTEIN ORE.


The present paper contains an axiomatic investigation of the conditions 
underlying the main theorems in the theory of rings. In many results it is 
related to the axiomatic study of the properties of groups given by Hausmann 
and Ore.¹ One considers, to begin with, a system with two operations, conveniently called addition and multiplication. The ideals in this system are
defined by the ordinary ideal properties and they must also be normal sub- 
systems with respect to the additive operation, normality being suitably defined. 

We consider first the conditions for the ordinary decomposition theorems 
of ideal theory to hold, i. e., the conditions for the ideals to form a Dedekind 
structure. It is found that in order to obtain this property the system has to 
satisfy conditions which are almost equivalent to the group axioms with respect 
to addition. Hence we proceed to the study of generalized rings, which are 
systems forming a group, usually non-commutative, with respect to addition. 
For the ideals in a general ring the Dedekind condition is found to be satisfied, 
provided only that a weak distributive law holds for multiplication. 

Next the conditions for the various properties of ideal multiplication 
are derived. Similarly the axiomatic conditions for the residuals or ideal 
quotients are obtained. One finds that the various properties of the residuals 
fall into classes, each dependent upon some particular property of the ring 
multiplication. 

As has already been stated, the present investigation seems useful because it makes clear the essential conditions for each part of ideal theory. This makes it possible, as soon as a system with two operations is given, to determine which conditions are satisfied. This appears to be a more satisfactory approach than the usual one, in which each ring generalization is studied separately with respect to its ideal properties. Because of the generality of the theory, it becomes possible to include well-known systems like the Lie algebras. A still more important case is perhaps the theory of ordinary groups. Any group may be made into a generalized ring by making the group multiplication
the addition in the ring, and introducing the commutator as the product of 
two elements. One finds then that the ordinary theory of normal subgroups 
becomes identical with the ideal theory in the ring. One also obtains com- 


* Received February 12, 1940. 
¹ B. A. Hausmann and Oystein Ore: "Theory of quasi-groups," American Journal of Mathematics, vol. 59 (1937), pp. 983-1003.


mutator products and commutator residuals in the group, and various group 
properties follow directly from the general properties of residuals. 
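The passage from a group to a generalized ring can be tried out on a small example. The sketch below takes the symmetric group on three letters, written as permutation tuples, uses the group multiplication as the ring "addition" and the commutator $a^{-1}b^{-1}ab$ as the ring product; this choice of commutator convention and all function names are assumptions made for illustration.

```python
from itertools import permutations

def compose(a, b):
    """Group operation (the ring 'addition'): first apply b, then a."""
    return tuple(a[i] for i in b)

def inverse(a):
    inv = [0] * len(a)
    for i, v in enumerate(a):
        inv[v] = i
    return tuple(inv)

def commutator(a, b):
    """Ring 'product' of a and b, taken here as a^-1 b^-1 a b."""
    return compose(compose(inverse(a), inverse(b)), compose(a, b))

def is_normal(sub, group):
    """True if g^-1 h g lies in sub for all g in group and h in sub."""
    return all(compose(compose(inverse(g), h), g) in sub for g in group for h in sub)

def is_two_sided_ideal(sub, group):
    """Normality in the additive group plus closure under products with any element."""
    return is_normal(sub, group) and all(
        commutator(r, a) in sub and commutator(a, r) in sub
        for r in group for a in sub
    )

S3 = set(permutations(range(3)))
A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # the normal subgroup of even permutations
H = {(0, 1, 2), (1, 0, 2)}               # a subgroup that is not normal
```

In this example `A3` satisfies both ideal conditions while `H` fails already the normality condition, in line with the statement that ideals correspond to normal subgroups.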


CHAPTER I. Conditions for decomposition theorems. 


1. Ordinary rings. A ring $\mathfrak{R}$ is usually defined as an algebraic system which is closed under two operations, addition and multiplication. About these two operations one usually makes the following assumptions.


Ring axioms.
1. The elements of $\mathfrak{R}$ form an Abelian group with respect to addition.


2. Multiplication satisfies the associative and distributive laws. Furthermore certain other properties like existence of units, cancellation laws or the commutative law may be postulated.

One of the main problems in such a ring is the study of the properties of its ideals. An ideal is then defined as a subsystem of the ring which satisfies the conditions:


Ideal conditions.
1. If the ideal $\mathfrak{a}$ contains $a_1$ and $a_2$ then it contains $a_1 \pm a_2$.


2. If r is any element of $\mathfrak{R}$ and a any element in $\mathfrak{a}$, then if $\mathfrak{a}$ is a left (right) ideal it contains ra (ar). If it contains both ra and ar the ideal is said to be two-sided.

In the following we shall write $\mathfrak{a} \cap \mathfrak{b}$ for the set of common elements of the two ideals or even arbitrary sets $\mathfrak{a}$ and $\mathfrak{b}$. The union $\mathfrak{a} \cup \mathfrak{b}$ of two (left, right, two-sided) ideals shall denote the smallest ideal of the same type containing $\mathfrak{a}$ and $\mathfrak{b}$.

Before we proceed let us make a few remarks as to the systematic meaning of the preceding ideal conditions. The first condition shows that the ideal is a modulus, i.e., a subgroup of the additive group. The set of all such subgroups forms a structure $\Sigma_M$, and this structure satisfies the Dedekind axiom since the group is Abelian.

The second ideal condition implies first that any ideal must be a subring. Now all subrings of $\mathfrak{R}$ form a structure $\Sigma_R$. The elements of $\Sigma_R$ belong to $\Sigma_M$ but $\Sigma_R$ is not a proper substructure of $\Sigma_M$ in the sense that the definitions of the structure operations coincide, since the union $\mathfrak{R}_1 \cup \mathfrak{R}_2$ of two subrings is the smallest ring containing $\mathfrak{R}_1$ and $\mathfrak{R}_2$ while their union $\mathfrak{R}_1 + \mathfrak{R}_2$ in $\Sigma_M$ is the smallest modulus containing both. Also $\Sigma_R$ will usually not satisfy the Dedekind axiom. One sees however that the requirement of the second ideal condition is sufficiently strong to make the set of ideals a Dedekind structure $\Sigma_I$, and this structure is a proper substructure of $\Sigma_M$ with $\mathfrak{a} \cup \mathfrak{b} = \mathfrak{a} + \mathfrak{b}$ for any two ideals.


2. Generalized rings. We shall now turn to the generalizations of the 
ring concept. A very broad extension of the concept is the following. 


Algebraic system with two operations: 


In the system $\mathfrak{R}$ there shall exist two operations, addition and multiplication, such that $a + b$ and $a \cdot b$ are uniquely defined elements of the system.
In this definition no conditions on the operations are assumed. However, 
in this general form there is but little which can be proved for the system. 
We shall therefore, at least for the moment, limit our considerations to a more 


specialized system, which covers most of the important applications: 


Generalized ring. 


A generalized ring is an algebraic system with two operations, addition 
and multiplication, forming a group with respect to addition. 

Let us observe that this definition assumes nothing about multiplication 
except that it exists. A subring may be defined as a subgroup of the additive 
group which is closed with respect to multiplication. The cross-cut of two 
subrings is the set of common elements. It cannot be void since it must 
contain the additive zero element. The union of two rings consists of all 
combinations of sums and products of the elements in the two rings. The 
subrings are seen to form a structure as before. 

The preceding discussion shows that it is natural to adopt the following 


definition of an ideal in a generalized ring: 


Generalized ideal conditions. 
1. An ideal $\mathfrak{a}$ is a normal subgroup of the additive group.


2. For a left (right) ideal $r \cdot \mathfrak{a} \subset \mathfrak{a}$ ($\mathfrak{a} \cdot r \subset \mathfrak{a}$) for every element r in $\mathfrak{R}$.


Since the cross-cut of two normal subgroups is again a normal subgroup it is obvious that the cross-cut $\mathfrak{a} \cap \mathfrak{b}$ of two ideals is again an ideal. Next we turn to the union $\mathfrak{a} \cup \mathfrak{b}$ of the same ideals. This union must contain all sums of elements in the two ideals and since $\mathfrak{a}$ and $\mathfrak{b}$ are normal subgroups of the additive group all such sums are of the form $a + b$. Now the second condition states that $c(a + b)$ shall also be in the ideal for any c in $\mathfrak{R}$. To create an analogy to the ordinary ideal theory and in particular to prove the Dedekind relation one is compelled to assume a distributive law of the form:


DISTRIBUTIVE LAW I. For any elements a, b, c, $c(a + b) = a_1 + b_1$, where $a_1$ and $b_1$ belong to the left ideals generated by a and b respectively.


To fix the ideas let us consider the properties of left ideals, hence we shall use the Distributive Law I for such ideals.

There are various special cases of this distributive law which may be mentioned separately. One case is $c(a + b) = c_1 \cdot a_1 + c_2 \cdot b_1$ where $c_1$ and $c_2$ are arbitrary while $a_1$ and $b_1$ belong to the ring, or still more specially, to the additive group defined by a and b respectively. Another special case is $c(a + b) = c_1 \cdot a_1^{r} + c_2 \cdot b_1^{s}$ where $c_1$, $c_2$, $a_1$ and $b_1$ may have the same meaning as before while the exponents r and s indicate the transformation by these elements in the additive group, $a^r = -r + a + r$.

We can now show: 


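THEOREM 1. In a generalized ring the left (right) ideals form a Dedekind structure when the Distributive Law I holds.


Proof. It remains only to deduce the Dedekind relation

$\mathfrak{c} \cap (\mathfrak{a} \cup \mathfrak{b}) = \mathfrak{a} \cup (\mathfrak{c} \cap \mathfrak{b})$, $\mathfrak{c} \supset \mathfrak{a}$,

for the ideals. It is obvious that the right-hand side is always contained in the left. To prove the converse let c be an element contained both in $\mathfrak{c}$ and $\mathfrak{a} \cup \mathfrak{b}$. Then $c = a + b$, or $b = -a + c$, hence b is contained in both $\mathfrak{c}$ and $\mathfrak{b}$.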
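The Dedekind relation can be observed in the simplest Abelian setting, where every subgroup is normal. The sketch below enumerates the subgroups of the four-group $Z_2 \times Z_2$ by brute force and checks $\mathfrak{c} \cap (\mathfrak{a} \cup \mathfrak{b}) = \mathfrak{a} \cup (\mathfrak{c} \cap \mathfrak{b})$ whenever $\mathfrak{a}$ is contained in $\mathfrak{c}$; the choice of group and the function names are illustrative assumptions.

```python
from itertools import combinations, product

MODULI = (2, 2)  # hypothetical example: the additive group Z_2 x Z_2
ELEMENTS = list(product(*(range(m) for m in MODULI)))

def add(x, y):
    return tuple((a + b) % m for a, b, m in zip(x, y, MODULI))

def subgroups():
    """All nonempty subsets closed under addition (these are the subgroups)."""
    found = []
    for size in range(1, len(ELEMENTS) + 1):
        for subset in combinations(ELEMENTS, size):
            s = frozenset(subset)
            if all(add(x, y) in s for x in s for y in s):
                found.append(s)
    return found

def join(a, b):
    """Union of two subgroups in the structure: all sums a + b."""
    return frozenset(add(x, y) for x in a for y in b)

def dedekind_holds():
    subs = subgroups()
    return all(
        c & join(a, b) == join(a, c & b)
        for a in subs for b in subs for c in subs
        if a <= c
    )
```

The group has five subgroups; the relation holds for all of them although the structure they form is not distributive.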

From Theorem 1 and the general theory of Dedekind structures it follows 
that all the ordinary decomposition theorems will hold, in particular the 
theorems on the representation of an ideal as the direct union of directly indecomposable ideals, the representation as the union of irreducible ideals, the theorem of Jordan-Hölder, and the more general refinement theorem.
It seems remarkable that these theorems can be deduced with such small 
assumptions on the system, since in particular there are no assumptions on 
the properties of the multiplication except the weak Distributive Law I. 

It should be mentioned at this point that the ordinary laws of isomorphism 
do not hold in their regular form in generalized rings without further strong 
conditions of distributivity. If $\mathfrak{a}$ is an ideal then the additive quotient group $\mathfrak{R}/\mathfrak{a}$ exists and $\mathfrak{R}$ is homomorphic to $\mathfrak{R}/\mathfrak{a}$ with respect to addition, but the residue classes in $\mathfrak{R}/\mathfrak{a}$ do not form a generalized ring since no multiplication is defined. Similar remarks apply to the law of isomorphism

$\mathfrak{a} \cup \mathfrak{b} / \mathfrak{a} \cong \mathfrak{b} / \mathfrak{a} \cap \mathfrak{b}$

which holds only in the additive sense.

The laws may be obtained however when certain new distributive conditions are imposed. If $\mathfrak{a}$ is a left ideal, then $\mathfrak{R}$ is homomorphic to $\mathfrak{R}/\mathfrak{a}$ with the elements of $\mathfrak{R}$ as domain of left multipliers, provided:


If a, b, and c are arbitrary elements, then $a(b + c) = ab + c_1$, where $c_1$ belongs to the left ideal defined by c.


Similarly one finds that $\mathfrak{R}/\mathfrak{a}$ is a generalized ring to which $\mathfrak{R}$ is homomorphic when $\mathfrak{a}$ is a two-sided ideal, provided:


If a, b, and c are arbitrary elements, then
$a(b + c) = ab + c_1$, $(b + c)a = ba + c_2$
where $c_1$ and $c_2$ belong to the two-sided ideal defined by c.


Under the same conditions also the second law of isomorphism will hold in its ordinary formulation.


3. Further remarks. In ordinary ideal theory one is also often interested in ideals with operators. There exist certain sets of operators A, B, $\cdots$ for the ring such that each operator produces a new element $A \cdot a$ from a given element a. As examples one may take automorphisms, differentiation, etc. For a generalized ring one can define similar operations. Usually one wishes to consider the ideals which are invariant with respect to these operations, $A \cdot \mathfrak{a} \subset \mathfrak{a}$, and one can prove under certain conditions that the structure of these ideals will also satisfy the Dedekind axiom. An analysis analogous to the preceding shows that this is the case provided the operations satisfy:


Distributive law for operators.

For any two elements a and b and any operator A one has $A \cdot (a + b) = a_1 + b_1$, where $a_1$ and $b_1$ belong to the (left, right) ideals defined by a and b.

In this connection one should mention that one may of course consider a 
generalized ring as a special case of a group with correspondences or operators. 
Those normal subgroups which are invariant with respect to a set of such 
operators will form a Dedekind structure if the operators satisfy the distributive 
law given above. 

The generalized ring discussed above is not the most general case in which 
one can derive a set of ideals forming a Dedekind structure. For an arbitrary 
system with two operations certain conditions may be obtained for the rules 
of operation insuring such properties of the ideals. 

We shall not go into details of this investigation, but only give the 
following sufficient conditions: 

1. The system shall be a quasi-group with respect to addition, i.e., the equations $a + x = b$, $y + a = b$ shall have unique solutions.


2. An associative law for addition, $(a + b) + c = a + d$, where d belongs to the ideal $\{b, c\}$ defined by b and c.


3. The Distributive Law I for multiplication.

Of course the ideals have to be suitably defined as certain types of normal 
sub-quasi-groups. The Associative Law II is designed to insure the existence 
of co-set expansions in the additive quasi-group. These results may be derived 
by methods similar to those used in the study of quasi-groups by Hausmann 
and Ore.²


CHAPTER II. Products and residuals. 


1. Products. From now on we shall suppose that the system $\mathfrak{R}$ under consideration is a generalized ring. If a, b, c, $\cdots$ is any set of elements of $\mathfrak{R}$ then we shall denote by $\{a, b, \cdots\}_l$, $\{a, b, \cdots\}_r$, $\{a, b, \cdots\}$ respectively the left, right and two-sided ideals generated by these elements. Any such ideal $\mathfrak{a}$ consists of all elements obtainable from the given ones by successive applications of the following three operations:


A) Addition and subtraction. 
B) Transformation in the additive group.
C) Multiplication on the left (right, or both) by arbitrary elements. 


Now let a and 6 be two ideals or even only two sets of elements. We then 
define their product as follows: 


Product. The product $\mathfrak{a} \cdot \mathfrak{b}$ is the left (right, two-sided) ideal generated by all products $a \cdot b$ where $a \subset \mathfrak{a}$, $b \subset \mathfrak{b}$.

In the following we shall usually assume that $\mathfrak{a}$ and $\mathfrak{b}$ are left ideals and that the product $\mathfrak{a} \cdot \mathfrak{b}$ is their left product.

The preceding definition does not correspond to the one usually introduced in the ordinary rings. Here one defines:


Bilinear product. The product $\mathfrak{a} \cdot \mathfrak{b}$ of two ideals $\mathfrak{a}$ and $\mathfrak{b}$ (or two arbitrary sets) is the set of elements of the form $c = \sum a_i b_i$ where $a_i \subset \mathfrak{a}$, $b_i \subset \mathfrak{b}$.

The obvious deficiency of this definition is that the resulting set is not an ideal except when one makes certain assumptions on the ring operations. If, however, the bilinear product is an ideal, it must be equal to the product as defined above.

The bilinear product is an additive group. To make it an ideal this group 
has to be normal and closed under the operation of multiplication on the left 


² Hausmann and Ore, loc. cit. In a paper submitted to the Bulletin of the American Mathematical Society Mr. Murdoch has given a more detailed discussion of this case.




(right) by an arbitrary element of $\mathfrak{R}$. This leads us to the following two conditions:

Normality condition I. For any three elements a, b, c in the generalized ring one shall have $c + a \cdot b - c = \sum a_i b_i$ where the $a_i$ belong to the ideal $\{a\}_l$ and the $b_i$ to the ideal $\{b\}_l$.


Associative-distributive law I. For any two sets of elements $b_1, \cdots, b_n$ and $c_1, \cdots, c_n$ and an arbitrary a one has $a(\sum b_i c_i) = \sum b'_i c'_i$ where the $b'_i$ belong to $\{b_1, \cdots, b_n\}_l$ and the $c'_i$ to the ideal $\{c_1, \cdots, c_n\}_l$.


This last rather complicated law may be considered as a combination of an associative and a distributive law in the same way as the ordinary distributive and associative laws may be joined into the single relation

$a(b_1 c_1 + b_2 c_2) = (ab_1)c_1 + (ab_2)c_2$.


When $\mathfrak{a}$ is a left ideal and $\mathfrak{b}$ a right ideal the product $\mathfrak{a} \cdot \mathfrak{b}$ is seen to be a
two-sided ideal if the normality condition and the associative-distributive law
hold in a suitable left and right formulation.

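We shall now indicate a few of the properties of the product. The
following are direct consequences of the definition:

1. If $\mathfrak{b} \supseteq \mathfrak{c}$ then $\mathfrak{a} \cdot \mathfrak{b} \supseteq \mathfrak{a} \cdot \mathfrak{c}$, $\mathfrak{b} \cdot \mathfrak{a} \supseteq \mathfrak{c} \cdot \mathfrak{a}$.
2. $\mathfrak{a}(\mathfrak{b} \cap \mathfrak{c}) \subseteq \mathfrak{a} \cdot \mathfrak{b} \cap \mathfrak{a} \cdot \mathfrak{c}$ for any $\mathfrak{a}$, $\mathfrak{b}$, $\mathfrak{c}$.
3. If $\mathfrak{a}$ is a left ideal then $\mathfrak{b} \cdot \mathfrak{a} \subseteq \mathfrak{a}$.

We shall now turn to the fundamental

4. Distributive ideal relation.
$(\mathfrak{a} \cup \mathfrak{b})\mathfrak{c} = \mathfrak{a} \cdot \mathfrak{c} \cup \mathfrak{b} \cdot \mathfrak{c}$.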


The proof of this relation requires conditions on the properties of the 


operations in the generalized ring, namely: 


Distributive law II. Let $a$, $b$, $c$ be arbitrary elements and $d$ an element
in the ideal $\{a, b\}_l$. Then the product $d \cdot c$ must belong to the ideal
$\mathfrak{m} = \{\dots, a_i c_i, \dots, b_i c_i, \dots\}$ where $a_i \in \{a\}_l$, $b_i \in \{b\}_l$, $c_i \in \{c\}_l$.

If one assumes the special Distributive Law I such that the set of ideals
becomes a Dedekind structure, then every element in $\{a, b\}_l$ is of the form
$d = a_1 + b_1$ and the Distributive Law II becomes simply $(a + b)c \in \mathfrak{m}$
where $\mathfrak{m}$ has the same meaning as before.

If one supposes that the generalized ring is so restricted that the ideal
products may be defined as bilinear products, then the corresponding
distributive law which is required for 4. takes the more explicit form:

Distributive law III. For any three elements $a$, $b$, $c$
$(a + b)c =$

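A direct consequence of 4. is:

5. For any two left ideals $\mathfrak{a}$ and $\mathfrak{b}$
$(\mathfrak{a} \cup \mathfrak{b}) \cdot (\mathfrak{a} \cap \mathfrak{b}) \subseteq \mathfrak{a} \cdot \mathfrak{b} \cup \mathfrak{b} \cdot \mathfrak{a}$.

One can also determine the conditions on the ring operations which insure:

6. Associative ideal multiplication.
$(\mathfrak{a}\mathfrak{b})\mathfrak{c} = \mathfrak{a}(\mathfrak{b}\mathfrak{c})$.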


Since the conditions are easily derived but slightly complicated in nature 
they may be omitted here. Another important law is: 


7. Commutative ideal multiplication.
$\mathfrak{a} \cdot \mathfrak{b} = \mathfrak{b} \cdot \mathfrak{a}$.

If this law is to hold it implies that for any $a$ and $b$, $b \cdot a \in \{\dots, a_i b_i, \dots\}$
where $a_i \in \{a\}_l$, $b_i \in \{b\}_l$ and similarly for the product $a \cdot b$. When the
ideal multiplication can be defined by bilinear products one must have
$a \cdot b = \sum b_i a_i$, $b \cdot a = \sum a'_j b'_j$.

But in the case of the commutative law one can also use a different
method and define a product which in all cases is commutative:

Commutative product. The commutative product of two ideals (sets) is

To conclude let us say that the ring $\mathfrak{R}$ has a left ideal unit if $\mathfrak{R} \cdot \mathfrak{a} = \mathfrak{a}$
for all left ideals $\mathfrak{a}$. Furthermore we say as usual that two ideals $\mathfrak{a}$ and $\mathfrak{b}$ are
relatively prime if $\mathfrak{a} \cup \mathfrak{b} = \mathfrak{R}$.

2. Two-sided ideals. For two-sided ideals some of the preceding properties
may be specified further. It should be noted that the Distributive Law II
is the only condition which is required to obtain these properties.

First one has the obvious:

1. For any two-sided ideals $\mathfrak{a}$ and $\mathfrak{b}$, $\mathfrak{a} \cdot \mathfrak{b} \subseteq \mathfrak{a} \cap \mathfrak{b}$.

When this is applied to 5. in §1 one finds

2. $(\mathfrak{a} \cup \mathfrak{b})(\mathfrak{a} \cap \mathfrak{b}) \subseteq \mathfrak{a} \cdot \mathfrak{b} \cup \mathfrak{b} \cdot \mathfrak{a} \subseteq \mathfrak{a} \cap \mathfrak{b}$.
From this relation follows immediately : 


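3. If $\mathfrak{a}$ and $\mathfrak{b}$ are two-sided relatively prime ideals and if $\mathfrak{R}$ is a left
ideal unit, then $\mathfrak{a} \cap \mathfrak{b} = \mathfrak{a} \cdot \mathfrak{b}$.

Let us say that a decomposition of an ideal in the form
$\mathfrak{a} = \mathfrak{a}_1 \cup \mathfrak{a}_2 \cup \dots \cup \mathfrak{a}_n$
is direct when the cross-cut of each $\mathfrak{a}_i$ with the union of the remaining is the
zero ideal. We can then prove:

THEOREM 1. Let $\mathfrak{R}$ be a generalized ring with unit ideal and
(1) $\mathfrak{R} = \mathfrak{a}_1 \cup \dots \cup \mathfrak{a}_m = \mathfrak{b}_1 \cup \dots \cup \mathfrak{b}_n$
two direct decompositions of $\mathfrak{R}$ as the union of two-sided ideals. Then the
two decompositions can be refined in such a way that they become identical.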


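Proof. We shall have to assume that the distributive ideal relation 4.
holds both for left and right multiplication with two-sided ideals. One then
finds
$\mathfrak{b}_i = \mathfrak{b}_i \mathfrak{R} = \mathfrak{b}_i \mathfrak{a}_1 \cup \mathfrak{b}_i \mathfrak{a}_2 \cup \dots \cup \mathfrak{b}_i \mathfrak{a}_m$,
$\mathfrak{a}_j = \mathfrak{R} \mathfrak{a}_j = \mathfrak{b}_1 \mathfrak{a}_j \cup \mathfrak{b}_2 \mathfrak{a}_j \cup \dots \cup \mathfrak{b}_n \mathfrak{a}_j$,
and both decompositions are obviously direct. When they are substituted in (1)
the two decompositions become identical. It remains only to show that the
resulting decompositions are direct, and this follows by repeated applications
of the lemma:

4. If $\mathfrak{a} \cup \mathfrak{b} \cup \mathfrak{c} = \mathfrak{R}$ and $\mathfrak{b} \cap \mathfrak{c} = \{0\}$, $\mathfrak{a} \cap (\mathfrak{b} \cup \mathfrak{c}) = \{0\}$ then
$\mathfrak{b} \cap (\mathfrak{a} \cup \mathfrak{c}) = \mathfrak{c} \cap (\mathfrak{a} \cup \mathfrak{b}) = \{0\}$.

Proof. From the first condition together with 3. one finds
$\mathfrak{b} \cap (\mathfrak{a} \cup \mathfrak{c}) = \mathfrak{b}(\mathfrak{a} \cup \mathfrak{c}) \cup (\mathfrak{a} \cup \mathfrak{c})\mathfrak{b} = \{0\}$.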

Theorem 1 implies that the representation of $\mathfrak{R}$ as the direct union of
direct indecomposable two-sided ideals is unique.

3. Residuals. We shall now turn to the definition of residuals or ideal
quotients in a generalized ring. First let $\mathfrak{a}$ and $\mathfrak{b}$ be arbitrary subsets of $\mathfrak{R}$.
We define $\mathfrak{q}_l = (\mathfrak{a} : \mathfrak{b})_l = \mathfrak{b} \backslash \mathfrak{a}$ as the set of elements $q$ such that $q \cdot b \in \mathfrak{a}$
for any $b$ in $\mathfrak{b}$. We shall call $\mathfrak{q}_l$ the left residual of $\mathfrak{a}$ with respect to $\mathfrak{b}$.
Similarly one defines the right residual $\mathfrak{q}_r = (\mathfrak{a} : \mathfrak{b})_r = \mathfrak{a} / \mathfrak{b}$. When $\mathfrak{a}$ and $\mathfrak{b}$
are sets consisting of a single element we obtain the ordinary quotients.
Obviously the residual may be a void set in certain cases.

From now on we shall usually assume that $\mathfrak{a}$ and $\mathfrak{b}$ are ideals of some
kind and in order to insure that the residual is not void we shall impose the
following rather natural condition: If the unit element of the additive group
is denoted by 0 then $0 \cdot a = a \cdot 0 = 0$ for every $a$ in $\mathfrak{R}$. This condition is
equivalent to saying that the zero-ideal shall consist of the single element 0.

In the following let $\mathfrak{a}$ be a left ideal and $\mathfrak{b}$ a (left, right) ideal. Without
any conditions on the operations in the generalized ring one can say little
about the properties of the sets $\mathfrak{b} \backslash \mathfrak{a}$ or $\mathfrak{a} / \mathfrak{b}$. We shall now impose such
conditions that the residual $\mathfrak{b} \backslash \mathfrak{a}$ becomes a left ideal. Corresponding to the
three properties characterizing an ideal one finds three conditions on the
operations of the generalized ring. To simplify the formulation of these
conditions, let $a$, $b$, and $c$ denote arbitrary elements, while $\mathfrak{c}_a$ denotes a left
ideal generated by products $a \cdot c_i$ where the $c_i$ belong to the left ideal $\{c\}$.


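THEOREM 2. The residual $\mathfrak{b} \backslash \mathfrak{a}$ where $\mathfrak{a}$ is a left ideal is itself a left
ideal when the following three conditions hold:

Normality condition II. $(a + b - a)c \in \mathfrak{c}_b$,
Distributive law IV. $(a + b)c \in \mathfrak{c}_a \cup \mathfrak{c}_b$,
Associative law I. $(ab)c \in \mathfrak{c}_b$.

THEOREM 3. Let $\mathfrak{a}$ and $\mathfrak{b}$ be left ideals. Then the residual $\mathfrak{b} \backslash \mathfrak{a}$ is a
two-sided ideal if also the following law is satisfied:

Associative law II. $(ab)c \in \mathfrak{c}_a$.

When the residual is shown to be an ideal one can substitute the equivalent
definition in terms of ideal multiplication:
The residual $\mathfrak{q}_l = \mathfrak{b} \backslash \mathfrak{a}$ is the largest ideal such that $\mathfrak{q}_l \cdot \mathfrak{b} \subseteq \mathfrak{a}$.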
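
The "largest ideal" characterization can be checked by hand in the most familiar special case. The following is only an illustrative sketch, not an example from the paper: it assumes the ordinary commutative ring of integers, where every ideal has the form $a\mathbb{Z}$, and compares the element-wise definition of the residual with the closed form $(a\mathbb{Z} : b\mathbb{Z}) = (a / \gcd(a, b))\mathbb{Z}$. The function names are hypothetical.

```python
from math import gcd


def in_residual(q, a, b):
    """True if q lies in the residual of the ideal aZ with respect to bZ.

    q belongs to the residual when q*x is in aZ for every x in bZ; since bZ
    is generated by b, it is enough to test the single product q*b.
    """
    return (q * b) % a == 0


def residual_generator(a, b):
    """Generator of the residual (aZ : bZ) in the integers: a / gcd(a, b)."""
    return a // gcd(a, b)


def check_largest_ideal(limit=12, span=60):
    """Compare the element-wise definition with the closed form on a finite window."""
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            g = residual_generator(a, b)
            for q in range(-span, span + 1):
                if in_residual(q, a, b) != (q % g == 0):
                    return False
    return True
```

For instance, `residual_generator(12, 8)` gives 3: the multiples of 3 are exactly the integers $q$ with $q \cdot 8\mathbb{Z} \subseteq 12\mathbb{Z}$. In this commutative setting the left and right residuals coincide, which is not assumed in the general theory above.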


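4. Properties of residuals. We shall now indicate some of the properties
of residuals. We mention first the properties which hold for residuals of
arbitrary sets with no assumptions on the ring operations.

1. $\mathfrak{R} / \mathfrak{a} = \mathfrak{a} \backslash \mathfrak{R} = \mathfrak{R}$ for any set $\mathfrak{a}$.
2. $\{0\} \backslash \mathfrak{a} = \mathfrak{R}$ if the set $\mathfrak{a}$ contains 0,
   $=$ void set if $\mathfrak{a}$ does not contain 0.
3. If $\mathfrak{b} \supseteq \mathfrak{c}$ are any sets then $\mathfrak{b} \backslash \mathfrak{a} \subseteq \mathfrak{c} \backslash \mathfrak{a}$, $\mathfrak{a} \backslash \mathfrak{b} \supseteq \mathfrak{a} \backslash \mathfrak{c}$.
4. $\mathfrak{c} \backslash (\mathfrak{a} \cap \mathfrak{b}) = (\mathfrak{c} \backslash \mathfrak{a}) \cap (\mathfrak{c} \backslash \mathfrak{b})$.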


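We shall now turn to certain relations which combine the properties of
left and right residuals.

6. For any sets $\mathfrak{a}$, $\mathfrak{b}$, and $\mathfrak{c}$ the relation $\mathfrak{a} / \mathfrak{b} \supseteq \mathfrak{c}$ implies $\mathfrak{c} \backslash \mathfrak{a} \supseteq \mathfrak{b}$.
7. $\mathfrak{a} / (\mathfrak{b} \backslash \mathfrak{a}) \supseteq \mathfrak{b}$, $(\mathfrak{a} / \mathfrak{b}) \backslash \mathfrak{a} \supseteq \mathfrak{b}$.
8. $\mathfrak{a} / ((\mathfrak{a} / \mathfrak{b}) \backslash \mathfrak{a}) = \mathfrak{a} / \mathfrak{b}$, $(\mathfrak{a} / (\mathfrak{b} \backslash \mathfrak{a})) \backslash \mathfrak{a} = \mathfrak{b} \backslash \mathfrak{a}$.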


The last relation is a consequence of 7. 
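Let us now turn to the more special case where the sets in question are
ideals. Then one sees immediately:

9. When $\mathfrak{a}$ is a left ideal then $\mathfrak{a} / \mathfrak{b} \supseteq \mathfrak{a}$.
10. When $\mathfrak{b}$ is a left ideal then $\mathfrak{b} \backslash \mathfrak{a} = \mathfrak{b} \backslash (\mathfrak{a} \cap \mathfrak{b})$.
11. For a left ideal $\mathfrak{a}$, $(\mathfrak{b} \cap \mathfrak{c}) \backslash \mathfrak{a} \supseteq (\mathfrak{b} \backslash \mathfrak{a}) \cup (\mathfrak{c} \backslash \mathfrak{a})$.
12. When $\mathfrak{a}$ and $\mathfrak{b}$ are left ideals, $\mathfrak{c} \backslash (\mathfrak{a} \cup \mathfrak{b}) \supseteq (\mathfrak{c} \backslash \mathfrak{a}) \cup (\mathfrak{c} \backslash \mathfrak{b})$.
13. If $\mathfrak{a}$ is a two-sided ideal $\mathfrak{b} \backslash \mathfrak{a} \supseteq \mathfrak{a}$, $\mathfrak{a} / \mathfrak{b} \supseteq \mathfrak{a}$.
14. If the generalized ring has an ideal unit then $\mathfrak{a} / \mathfrak{b} = \mathfrak{R}$ if and
only if $\mathfrak{a} \supseteq \mathfrak{b}$.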


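When the Distributive Law II, hence the distributive ideal relation 4. § 1,
holds, then one can obtain the two further properties:

15. Let $\mathfrak{a}$, $\mathfrak{b}$, and $\mathfrak{c}$ be left ideals. Then $(\mathfrak{b} \cup \mathfrak{c}) \backslash \mathfrak{a} = (\mathfrak{b} \backslash \mathfrak{a}) \cap (\mathfrak{c} \backslash \mathfrak{a})$.
16. $(\mathfrak{b} \cup \mathfrak{a}) \backslash \mathfrak{a} = \mathfrak{b} \backslash \mathfrak{a}$.

Finally one finds if ideal multiplication is associative

17. $(\mathfrak{c} \backslash \mathfrak{b}) \backslash \mathfrak{a} = \mathfrak{c}\mathfrak{b} \backslash \mathfrak{a}$.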

We shall not go further into the properties of generalized rings. After
the conditions for the principal ideal properties have been established one can
proceed to define a prime ideal and prove certain decomposition properties
much in the same way as in ordinary rings. An ideal $\mathfrak{a}$ is nilpotent if $\mathfrak{a}^n = \{0\}$
and the radical is the maximal nilpotent ideal. A generalized ring without
radical is said to be semi-simple.*


* From this point on the investigations would follow the structural theory outlined
by R. P. Dilworth: "Non-commutative residuated lattices," Transactions of the American
Mathematical Society, vol. 46 (1939), pp. 426-444.


CHAPTER III. Applications to groups. 


1. Commutator product. As an important application of the preceding 
theory we shall show that the ordinary theory of normal subgroups in a group 
may be considered as a special case of ideal theory in a generalized ring. 

Let G be a group whose operations are written in additive form, and hence
$a + b$ denotes the ordinary product and 0 the unit element. Any group may
then be represented as a generalized ring when we define multiplication of
two elements as their commutator, $a \circ b = a + b - a - b = b^a - b$ where
$b^a = a + b - a$. This operation has various simple properties. Obviously
$a \circ b = 0$ if and only if $a$ and $b$ commute additively. In particular
$a \circ 0 = 0 \circ a = 0$, and $a \circ a = 0$. In an Abelian group every product vanishes.


One sees immediately : 


The ideals with respect to commutator multiplication are the normal 


subgroups and every ideal is two-sided. 


Let us now turn to the properties of the ideal product $A \circ B$ of two ideals
or normal subgroups $A$ and $B$. This product consists of all elements generated
by commutators $a \circ b$ and since the transform of any such commutator is of
the form $a_1 \circ b_1$ it follows immediately that the product can be defined in the
bilinear form. This may also be verified by an explicit formulation of the
commutator rules in such a manner that the preceding laws are seen to be
fulfilled.

From the relation $a \circ b + b \circ a = 0$ follows immediately:

Any two ideals are permutable, $A \circ B = B \circ A$.


It should be noted, however, that ideal multiplication is usually not associative. 


For the commutator product one has the distributive laws

$(a + b) \circ c = a \circ (b \circ c) + b \circ c + a \circ c$,
$c \circ (a + b) = c \circ a - (c \circ b) \circ a + c \circ b$

or also

$(a + b) \circ c = (b \circ c)^a + a \circ c$, $c \circ (a + b) = c \circ a + (c \circ b)^a$.

These rules are sufficiently strong to imply all the preceding Distributive Laws
I-IV and also the other distributive laws mentioned in § 2, Chapter I. As a
consequence the distributive ideal relation 4. in § 1, Chapter II holds for
multiplication on both sides and hence all the properties of ideal products are
valid for normal subgroups.
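
The second pair of distributive laws, together with the relation $a \circ b + b \circ a = 0$, can be verified mechanically in a small non-Abelian group. The sketch below is an illustration added here, not part of the original argument: it assumes the symmetric group on three letters, written additively as in the text, with permutations stored as tuples. All function names are hypothetical.

```python
from itertools import permutations, product


def add(a, b):
    """Group operation in the text's additive notation: (a + b)(i) = a(b(i))."""
    return tuple(a[i] for i in b)


def neg(a):
    """The inverse permutation, written -a."""
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)


def comm(a, b):
    """Commutator product a o b = a + b - a - b."""
    return add(add(a, b), add(neg(a), neg(b)))


def transform(x, a):
    """Transform x^a = a + x - a."""
    return add(add(a, x), neg(a))


S3 = list(permutations(range(3)))
ZERO = tuple(range(3))


def identities_hold(group, zero):
    """Check the two distributive laws and a o b + b o a = 0 on all triples."""
    for a, b, c in product(group, repeat=3):
        if comm(add(a, b), c) != add(transform(comm(b, c), a), comm(a, c)):
            return False
        if comm(c, add(a, b)) != add(comm(c, a), transform(comm(c, b), a)):
            return False
        if add(comm(a, b), comm(b, a)) != zero:
            return False
    return True
```

Calling `identities_hold(S3, ZERO)` runs through all 216 triples. A finite check of this kind only confirms the identities in one group; the general statements follow by expanding both sides in the additive notation.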


The two identities $(a \circ b)^c = a^c \circ b^c$, $b^a \circ c = (b \circ c^{-a})^a$ show that the
Normality conditions I and II are also satisfied. Similarly the relations

$(a \circ b) \circ c = (b^a - b) \circ c = (a + (-a)^b) \circ c$

will serve to verify the Associative laws I and II. At this point one might
also mention the following identity

$a \circ (b + c) + b \circ (c + a) + c \circ (a + b) = 0$.


The preceding relations imply the existence of the residual $Q = A : B$
as the largest subgroup such that $Q \circ B \subseteq A$. This might of course have been
established directly and one also finds: The right and left residuals are equal.
All the results on residuals which have been obtained previously now apply
automatically to residuals in groups.

Now let us turn to the intrinsic meaning of the product of two normal
subgroups. The product $P = A \circ B$ consists of products of elements of the
form $a \circ b$ and it is contained both in $A$ and in $B$. If $P = 0$ the elements in
the two groups commute. By applying this to the quotient groups one finds:

The product $P = A \circ B$ is the largest subgroup such that the elements
of the two quotient groups $A/P$ and $B/P$ commute.

From the relation 2. in § 2, Chapter II follows $(A \cap B)^2 \subseteq A \circ B \subseteq A \cap B$,
and hence: The product $A \circ B$ is contained in $A \cap B$ and contains the
commutator group of $A \cap B$.
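
A small computation may make the commutator product of normal subgroups concrete. The sketch below is an added illustration under stated assumptions: it works in the symmetric group on four letters, with $S_4$, the alternating group $A_4$ and the four-group $V_4$ as the normal subgroups, and it forms $A \circ B$ as the subgroup generated by all commutators. The helper names are hypothetical.

```python
from itertools import permutations

N = 4
IDENTITY = tuple(range(N))


def add(a, b):
    """Group operation in additive notation: (a + b)(i) = a(b(i))."""
    return tuple(a[i] for i in b)


def neg(a):
    """Inverse permutation."""
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)


def comm(a, b):
    """Commutator a o b = a + b - a - b."""
    return add(add(a, b), add(neg(a), neg(b)))


def is_even(p):
    """Parity of a permutation via its inversion count."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0


def generated(gens):
    """Subgroup generated by gens (closure under the operation suffices in a finite group)."""
    elems = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = add(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def commutator_product(A, B):
    """A o B: the subgroup generated by all commutators a o b."""
    return generated({comm(a, b) for a in A for b in B})


S4 = set(permutations(range(N)))
A4 = {p for p in S4 if is_even(p)}
V4 = {(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}
```

With these sets, `commutator_product(S4, S4)` is the alternating group and `commutator_product(A4, V4)` is the four-group, and in each case the product lies inside the intersection of the two factors.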

From the definition it follows that the residual $Q = A : B$ is the largest
subgroup such that all the commutators of elements of $Q$ with elements of $B$
belong to $A$, and hence to $A \cap B$. This means that for any $q$ in $Q$ and $b$ in $B$
=b-d where d belongs to B. 

The centre $C$ of $G$ may be defined as the maximal group such that
$G \circ C = 0$. The necessary and sufficient condition that $G \circ A = A$ is that
there exist no normal subgroup $P$ in $G$ such that $A/P$ belongs to the centre
of $G/P$. This implies that if there exist no normal subgroups $A/B$ in $G$
such that $A/B$ belongs to the centre of $G/B$ then $G$ is an ideal unit and the
previous theorem on the unique direct decomposition of the group into normal
components will hold.


2. Other applications. One could also give the application of the preceding
theory to groups by defining the product in an additive group as the
transform $a \times b = b^a = a + b - a$. One finds the relations

$a \times a = a$, $a \times (-b) = -a \times b$, $(a + b)$


and many others. From the point of view of ideal theory this generalized 
ring is not as interesting as the commutator ring. One finds that the normal 
subgroups are the left ideals while the only right ideal is the full group. 

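More interesting are the Lie rings with an Abelian additive group and
the defining relations for multiplication

$a \circ (b + c) = a \circ b + a \circ c$, $(b + c) \circ a = b \circ a + c \circ a$, $a \circ a = 0$,
$a \circ b + b \circ a = 0$, $a \circ (b \circ c) + b \circ (c \circ a) + c \circ (a \circ b) = 0$.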


From the point of view of ideal theory the Lie rings have about the same 
properties as the commutator ring of ordinary groups. The normality con- 
ditions and the distributive laws are trivially satisfied. By means of the last 
relation one finds that the associative-distributive law holds, and hence the 
ideals may be defined as a bilinear product. All ideals are seen to be two-sided 
and ideal multiplication is commutative. Similarly it follows that all residuals 
exist as two-sided ideals having all the properties previously derived. The only 
condition which is not satisfied is the associative law for ideal multiplication. 
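
The defining relations of a Lie ring can be checked on a standard model. The following sketch is an added illustration, not taken from the paper: it assumes 2-by-2 integer matrices with the product $a \circ b = ab - ba$ and tests the relations on a few sample matrices. The names `bracket`, `SAMPLES` and `lie_relations_hold` are hypothetical.

```python
from itertools import product


def mat_add(x, y):
    """Entrywise sum of two square matrices stored as tuples of tuples."""
    return tuple(tuple(a + b for a, b in zip(rx, ry)) for rx, ry in zip(x, y))


def mat_neg(x):
    """Entrywise negative."""
    return tuple(tuple(-a for a in row) for row in x)


def mat_mul(x, y):
    """Ordinary matrix product."""
    n = len(x)
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(n))
                       for j in range(n)) for i in range(n))


def bracket(x, y):
    """Lie product x o y = xy - yx."""
    return mat_add(mat_mul(x, y), mat_neg(mat_mul(y, x)))


ZERO = ((0, 0), (0, 0))
SAMPLES = [((1, 2), (3, 4)), ((0, 1), (0, 0)), ((0, 0), (1, 0)), ((2, -1), (5, 3))]


def lie_relations_hold(samples):
    """Check both distributive laws, a o a = 0, anticommutativity and the Jacobi relation."""
    for a, b, c in product(samples, repeat=3):
        if bracket(a, mat_add(b, c)) != mat_add(bracket(a, b), bracket(a, c)):
            return False
        if bracket(mat_add(b, c), a) != mat_add(bracket(b, a), bracket(c, a)):
            return False
        if bracket(a, a) != ZERO:
            return False
        if mat_add(bracket(a, b), bracket(b, a)) != ZERO:
            return False
        jacobi = mat_add(mat_add(bracket(a, bracket(b, c)),
                                 bracket(b, bracket(c, a))),
                         bracket(c, bracket(a, b)))
        if jacobi != ZERO:
            return False
    return True
```

Integer entries keep every comparison exact. The check covers only the listed samples, so it illustrates the relations rather than proving them.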


YALE UNIVERSITY. 




SERIES EXPANSIONS IN LINEAR VECTOR SPACE.* 


By Orrin FRINK, JR. 


1. Introduction. A familiar problem is that of associating with every
element $f$ of a function space $F$ a series expansion $\sum_{n=1}^{\infty} c_n p_n$ in terms of a fixed
set $\{p_n\}$ of functions of $F$, the coefficients $\{c_n\}$ being real or complex numbers.
In order to include the more important function spaces as special cases, this
problem is considered here for a real or complex Banach space. Many of the
results, however, can be extended to linear vector spaces without a norm. The
first question that arises is how to select the fixed elements $\{p_n\}$ so that the
series expansion will be unique. It is clear that if uniqueness is expected, no
element $p_n$ should be a linear combination of the others, and it seems natural
to impose the stronger condition that no element of $\{p_n\}$ be the limit of a
sequence of linear combinations of the other elements of $\{p_n\}$. If this
condition holds, the set $\{p_n\}$ is said to be minimal. It will be seen that this
condition alone leads to an interesting theory of series expansions.

Kaczmarz and Steinhaus ([2], p. 264) have pointed out the connection
between the property of minimality and biorthogonal systems. Hence it is
not surprising to find that the expansion theory of this paper is just as general
as the theory of biorthogonal expansions discussed by Banach ([1], p. 106)
in the sense that a minimal set $\{p_n\}$ is always part of a biorthogonal system
$\{p_n, f_n\}$. This fact, however, is a consequence of the expansion theory given
here, and would be difficult to prove in any other manner in the general case.
It will be seen that many properties of expansions in terms of a minimal set
may be advantageously studied in terms of the property of minimality, without
making any use of biorthogonality. The existence of a set of coefficient
functionals $\{f_n\}$ which together with the minimal set $\{p_n\}$ form a biorthogonal
system, is a necessary consequence of the uniqueness of the expansion.

Although no very sharp theorems of convergence or summability of series
are to be expected in such a general situation, the method of determining the
expansion coefficients used here suggests that semiregular methods of
summability, hitherto neglected, are the natural methods to use with minimal series.
Semiregular methods, though they may sum a series of numbers to the wrong
sum, never assign the wrong sum to a minimal series.

Since the more usual special cases of biorthogonal series are so well 

* Received April 18, 1940. 




known, most of the applications given here are to series not usually treated 
as biorthogonal. In particular, the inclusion of complex Banach spaces allows 
of interesting applications to the theory of functions of a complex variable, 


including power series. 


2. Definitions. A real or complex Banach space $B$ is a linear vector
space whose elements may be added together or multiplied by real or complex
numbers subject to conditions found in Banach ([1], p. 26). To every
element $x$ of $B$ is assigned a real number $|x|$ called the norm of $x$, so that
(1) $|\theta| = 0$, where $\theta$ is the zero element of $B$, (2) $|x| > 0$ for $x \neq \theta$,
(3) $|x + y| \leq |x| + |y|$, and (4) $|\alpha x| = |\alpha| \cdot |x|$. Convergence of a
sequence of elements to a limit is defined in terms of the distance $|x - y|$.
A Banach space is complete; that is, every sequence having the Cauchy
property converges. Elements of $B$ will usually be denoted by Roman letters, and
real and complex numbers by Greek letters.

The linear extension $A^L$ of a set $A$ of elements of $B$ is the set of all
(finite) linear combinations of elements of $A$. The closed linear extension
$A^C$ of $A$ is the set of all limits of linear combinations of elements of $A$. The
set $A^C$ is the smallest Banach space which contains $A$ and is closed in $B$.

A sequence $P$ of elements $\{p_n\}$ is said to be minimal if no element of $P$
is the limit of a sequence of linear combinations of the other elements of $P$,
that is, if $p_n \notin (P - p_n)^C$ for every $n$. An equivalent condition is that
$(P - p_n)^C \neq P^C$ for every $n$.


3. Minimal series. 

THEOREM 1. If $P = \{p_n\}$ is a minimal sequence of elements of a real
or complex Banach space, and $x \in P^C$, then for every $n$ there exists one and
only one real or complex number $\xi_n$ such that $(x - \xi_n p_n) \in (P - p_n)^C$.

It is no restriction to assume $n = 1$. Since $x \in P^C$, a sequence $z_m$
exists such that

(1) $z_m = \sum_{r=1}^{m} \alpha_{mr} p_r \to x$.

The sequence $\{\alpha_{m1}\}$ is bounded, for otherwise it would have a subsequence
$\{\beta_r\}$ such that

(2) $\lim_{r \to \infty} |\beta_{r+1} - \beta_r| = \infty$.

Then denoting by $\{w_r\}$ the corresponding subsequence of $\{z_m\}$, it would
follow that $v_r \to p_1$, where

(3) $v_r = p_1 - (w_{r+1} - w_r) / (\beta_{r+1} - \beta_r)$.




But the coefficient of $p_1$ in $v_r$ is zero, which is contrary to the assumption that
$P$ is minimal. Hence $\{\alpha_{m1}\}$ is bounded, and a subsequence $\{\gamma_r\}$ can be selected
from it which converges to some number $\xi_1$. Let the corresponding
subsequence of $\{z_m\}$ be $\{u_r\}$. Then

$u_r \to x$, and $\gamma_r p_1 \to \xi_1 p_1$.

Subtracting gives $(u_r - \gamma_r p_1) \to (x - \xi_1 p_1)$. Since $u_r - \gamma_r p_1$ does not contain
$p_1$, the number $\xi_1$ has the property required by the theorem. If $\xi_1$ were not
unique, there would exist two sequences $\{s_n\}$ and $\{t_n\}$ of linear combinations
of $(P - p_1)$ such that $s_n \to x - \xi_1 p_1$ and $t_n \to x - \eta_1 p_1$, where $\xi_1 \neq \eta_1$. Then
$(s_n - t_n) / (\eta_1 - \xi_1) \to p_1$, contrary to the assumption that $P$ is minimal.
This completes the proof.

According to Theorem 1, if $P$ is minimal every element $x$ of $P^C$ has
assigned to it a unique sequence of numbers $\{\xi_n\}$. These are called the
expansion coefficients of $x$ in terms of $P$. The formal series $\sum_{n=1}^{\infty} \xi_n p_n$ formed
with these coefficients is called the series expansion of $x$ in terms of $P$. The
relation between $x$ and its series expansion is denoted by writing $x \sim \sum \xi_n p_n$.
For $n$ fixed, the dependence of $\xi_n$ on $x$ is denoted by $\xi_n = f_n(x)$. The
functionals $f_n$, $n = 1, 2, \dots$, are called the coefficient functionals associated
with the minimal set $P$.


THEOREM 2. The coefficient functionals $f_n(x)$ associated with a minimal
set $P$ are additive, that is $f_n(\alpha x + \beta y) = \alpha f_n(x) + \beta f_n(y)$.

This is an easy consequence of Theorem 1 and the definition of $f_n(x)$.

THEOREM 3. If $P = \{p_n\}$ is minimal and $z_n \to x$, where $x \in P^C$ and
$z_n = \sum_r \alpha_{nr} p_r$, then, for every $r$, $\lim_{n \to \infty} \alpha_{nr} = \xi_r$, where $\xi_r = f_r(x)$ is the
expansion coefficient of $x$ in terms of $P$.

The proof is similar to that of Theorem 1. The sequence $\{\alpha_{nr}\}$ for $r$
fixed is bounded, and if any subsequence of it converges to a number $\alpha$, then
$\alpha$ has the property of $\xi_r$ in Theorem 1. Since $\xi_r$ is unique, any convergent
subsequence of $\{\alpha_{nr}\}$ converges to $\xi_r$, hence the sequence itself converges to $\xi_r$.

Theorem 3 shows that all the expansion coefficients of $x$ can be determined
from any sequence of elements of $P^L$ converging to $x$. As an illustration,
from any sequence of polynomials converging uniformly to a real function $x(t)$,
continuous on an interval $[a, b]$, can be determined the expansion coefficients
of $x(t)$ in terms of any set of polynomials minimal on $[a, b]$. Such sets
include all relatively orthogonal sets of polynomials for the interval $[a, b]$ or
any smaller interval.




If $P = \{p_n\}$ is minimal, we shall denote by $P_n$ the set $\{p_{n+1}, p_{n+2}, \dots\}$;
and if $x \in P^C$, we shall denote by $s_n$ the partial sum $\sum_{r=1}^{n} \xi_r p_r$ of the series
expansion of $x$ in terms of $P$.

THEOREM 4. If $P$ is minimal and $x \in P^C$, then $x - s_n \in (P_n)^C$. If $z_n$
is any linear combination $\sum_{r=1}^{n} \alpha_r p_r$ of the elements $p_1, p_2, \dots, p_n$ such that
$z_n \neq s_n$, then $x - z_n \notin (P_n)^C$.

There exist numbers $\alpha_{mr}$ such that

(4) $\lim_{m \to \infty} \sum_{r=1}^{m} \alpha_{mr} p_r = x$.

By Theorem 3,

(5) $\lim_{m \to \infty} \sum_{r=1}^{n} \alpha_{mr} p_r = \sum_{r=1}^{n} \xi_r p_r$.

Subtracting (5) from (4) gives

(6) $\lim_{m \to \infty} \sum_{r=n+1}^{m} \alpha_{mr} p_r = x - \sum_{r=1}^{n} \xi_r p_r$.

This proves the first part of the theorem. The second part follows from the
uniqueness of the numbers $\xi_n$ established in Theorem 1. Theorem 4 shows
that in one sense, the partial sum $s_n$ is the "best possible" approximation
to $x$ of all linear combinations of $p_1, \dots, p_n$.


4. Normalization. Theorems 1 to 4 hold in linear vector spaces more
general than Banach spaces, in which convergence is defined independently of
the notion of the norm $|x|$ of an element $x$, since no use was made of the
norm in the proofs. It was shown in Theorem 2 that the coefficient functionals
are additive. To prove that they are also continuous, it is convenient to make
use of the norm, and in particular to consider ways of normalizing the elements
$\{p_n\}$ of a minimal set $P$. If $\alpha_n \neq 0$, the set $Q = \{\alpha_n p_n\}$ is also minimal if $P$
is, and the theory of series expansions in terms of $P$ and $Q$ are equivalent,
since if $x \sim \sum \xi_n p_n$, then $x \sim \sum (\xi_n / \alpha_n)\, \alpha_n p_n$. One way to normalize $P$ would
be to demand that $|p_n| = 1$ for every $n$. For certain purposes it is more
convenient to define normalization in terms of the property of minimality, as
follows.

Definition. If $P = \{p_n\}$ is minimal, then $\rho_n$ is defined to be the minimum
distance from $p_n$ to the closed set $(P - p_n)^C$. Since $p_n$ is not a member of the
set $(P - p_n)^C$, this distance $\rho_n$ is not zero. It can be seen that $\rho_n$ is equal to
the greatest lower bound of $|p_n - z|$ for all $z$ in $(P - p_n)^L$.




Definition. The element $p_n$ of the minimal set $P$ is said to be normalized
if $\rho_n = 1$. $P$ is said to be normalized if all its elements are normalized. If $P$
is minimal but not normalized, the equivalent set $Q = \{q_n\}$, where $q_n = p_n / \rho_n$,
is normalized. It is interesting to note that any orthonormal basis $P = \{p_n\}$
for Hilbert space is normalized in the sense of the above definition.

THEOREM 5. If $P = \{p_n\}$ is normalized and minimal, then
$|\sum_{r=1}^{n} \alpha_r p_r| \geq |\alpha_k|$ for $1 \leq k \leq n$.

Suppose $|\sum_{r=1}^{n} \alpha_r p_r| < |\alpha_k|$. Then $\alpha_k \neq 0$, and dividing by $\alpha_k$ gives
$|p_k - \sum_{r=1}^{n} \beta_r p_r| < 1$, where $\beta_k = 0$ and $\beta_r = -\alpha_r / \alpha_k$ for $r \neq k$. This
contradicts the assumption that $p_k$ was normalized.
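
Normalization and the inequality of Theorem 5 can be tried out in the simplest possible setting. The sketch below is an added illustration under strong simplifying assumptions: the space is the Euclidean plane, the minimal set consists of two linearly independent vectors, and $\rho_k$ is the distance from one vector to the line spanned by the other. The function names are hypothetical.

```python
import math


def dist_to_line(p, d):
    """Euclidean distance from the point p to the line through the origin spanned by d."""
    return abs(p[0] * d[1] - p[1] * d[0]) / math.hypot(d[0], d[1])


def normalize_pair(p1, p2):
    """Divide each vector by its distance rho to the span of the other one."""
    rho1 = dist_to_line(p1, p2)
    rho2 = dist_to_line(p2, p1)
    q1 = (p1[0] / rho1, p1[1] / rho1)
    q2 = (p2[0] / rho2, p2[1] / rho2)
    return q1, q2


def theorem5_holds(q1, q2, alphas, tol=1e-9):
    """Check |a1*q1 + a2*q2| >= max(|a1|, |a2|) for all pairs drawn from alphas."""
    for a1 in alphas:
        for a2 in alphas:
            norm = math.hypot(a1 * q1[0] + a2 * q2[0], a1 * q1[1] + a2 * q2[1])
            if norm < max(abs(a1), abs(a2)) - tol:
                return False
    return True
```

For the pair (1, 0) and (1, 1) the normalized vectors each lie at distance 1 from the span of the other, and the inequality holds on any grid of coefficients. For an unnormalized pair such as (1, 0) and (1, 0.1) it fails already for the coefficients 1 and -1, since the difference has norm 0.1.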
THEOREM 6. The coefficient functionals $\{f_n\}$ associated with a minimal
set $P = \{p_n\}$ are continuous; that is, if $x_m \to x$, where $x_m$ and $x$ are in $P^C$,
and $\xi_{mn} = f_n(x_m)$ and $\xi_n = f_n(x)$ are the n-th expansion coefficients of $x_m$
and $x$ in terms of $P$, then $\lim_{m \to \infty} \xi_{mn} = \xi_n$ for every $n$.

In the proof we may assume that $P$ is normalized and $n = 1$. Given $\epsilon > 0$,
it is sufficient to show that there exists an $N$ such that if $m > N$, then
$|\xi_{m1} - \xi_1| < 3\epsilon$. By Theorem 1 there exist elements $z_m$ and $z$ which are
linear combinations of $\{p_2, p_3, \dots\}$ such that

(7) $|\xi_{m1} p_1 + z_m - x_m| < \epsilon$,
    $|\xi_1 p_1 + z - x| < \epsilon$.

Since $x_m \to x$, there exists an $N$ such that for $m > N$,

(8) $|x_m - x| < \epsilon$.

Combining (7) and (8) gives

(9) $|(\xi_{m1} - \xi_1) p_1 + w| < 3\epsilon$,

where $w = z_m - z$. However, by Theorem 5,

(10) $|(\xi_{m1} - \xi_1) p_1 + w| \geq |\xi_{m1} - \xi_1|$.

Hence $|\xi_{m1} - \xi_1| < 3\epsilon$ for $m > N$, which was to be proved.


5. Biorthogonal systems. It has now been shown that the coefficient
functionals $\{f_n\}$ associated with a minimal set $P = \{p_n\}$ are additive
continuous functionals on the Banach space $P^C$. Such functionals are called
linear, and they belong to the Banach space $Q$ conjugate to $P^C$, consisting of
all linear functionals on $P^C$. The norm $|f|$ of such a functional $f$ is defined
to be the least real number $M$ such that $|f(x)| \leq M|x|$ for all $x \in P^C$
(Banach [1], p. 54). A set $\{p_n, f_n\}$ consisting of a sequence of elements $\{p_n\}$
and a sequence of functionals $\{f_n\}$ of a Banach space $B$ is said to be a
biorthogonal system if $f_m(p_n) = \delta_{mn}$. In terms of such a biorthogonal system
every element $x$ of $B$ has the biorthogonal series expansion $x \sim \sum_{n=1}^{\infty} f_n(x) p_n$, and
every linear functional $f$ on $B$ has the series expansion $f \sim \sum_{n=1}^{\infty} f(p_n) f_n$.

THEOREM 7. If $P = \{p_n\}$ is minimal, and the associated coefficient
functionals are $\{f_n\}$, then the set $\{p_n, f_n\}$ is a biorthogonal system, and the minimal
series expansion of an element $x$ of $P^C$ in terms of $P$ is the same as the
biorthogonal expansion of $x$ in terms of the system $\{p_n, f_n\}$. Conversely, if
$\{p_n, f_n\}$ is a biorthogonal system, then the set $P = \{p_n\}$ is minimal.


The first part of the theorem follows from Theorems 2, 3, and 6. To prove
the second part, suppose $\{p_n, f_n\}$ is a biorthogonal system, but $P = \{p_n\}$ is not
minimal. Then for some $n$, $\lim_{m \to \infty} z_m = p_n$, where $z_m$ is a linear combination of
elements of $(P - p_n)$. Since $f_n$ is a continuous functional, $\lim f_n(z_m) = f_n(p_n)$.
But this is impossible, since $f_n(z_m) = 0$, and $f_n(p_n) = 1$.
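
In a finite-dimensional space the coefficient functionals can be written down explicitly, which makes the biorthogonality relation easy to inspect. The sketch below is an added illustration, not part of the paper: it assumes the plane with two linearly independent vectors and obtains the functionals from Cramer's rule, using exact rational arithmetic. The names `det` and `coefficient_functionals` are hypothetical.

```python
from fractions import Fraction


def det(u, v):
    """2-by-2 determinant with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]


def coefficient_functionals(p1, p2):
    """Return (f1, f2) with x = f1(x)*p1 + f2(x)*p2 for every x in the plane."""
    d = Fraction(det(p1, p2))
    if d == 0:
        raise ValueError("p1 and p2 are linearly dependent, so the set is not minimal")

    def f1(x):
        return Fraction(det(x, p2)) / d

    def f2(x):
        return Fraction(det(p1, x)) / d

    return f1, f2
```

With `p1 = (2, 1)` and `p2 = (1, 3)` one finds `f1(p1) = f2(p2) = 1` and `f1(p2) = f2(p1) = 0`, and the vector (7, -4) has the coefficients 5 and -3. When the two vectors are dependent the construction breaks down, in line with the converse part of Theorem 7.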

Definition. The set of elements $P = \{p_n\}$ of a Banach space is said to
be weakly minimal if no element $p_n$ of $P$ is the weak limit of a sequence of
linear combinations of elements of $(P - p_n)$. The set of functionals $F = \{f_n\}$
on the space $B$ is said to be weakly minimal if no functional $f_n$ is the
limit in the sense of weak convergence of functionals of a sequence of linear
combinations of $(F - f_n)$.

THEOREM 8. If $\{p_n, f_n\}$ is a biorthogonal system, then the sets $\{p_n\}$ and
$\{f_n\}$ are both weakly minimal.

The proof is similar to that of Theorem 7. Of course, if a set is weakly
minimal it is necessarily minimal.

Since the space $Q$ conjugate to $P^C$ is also a Banach space, the theory of
minimal expansions in $Q$ is to some extent covered by the theory for $P^C$. There
is the difficulty, however, that while $P^C$ is separable, the conjugate space $Q$ may
not be separable. If $Q$ is not separable, it is impossible for the linear
combinations of the set $F = \{f_n\}$ to be dense in $Q$. In this case there are two
possibilities. One may consider minimal expansions only for functionals $f$
which are in the closed linear extension $F^C$. Or, since there is always a
countable set which is weakly dense in $Q$ (Banach [1], p. 124), one may consider
biorthogonal systems $\{p_n, f_n\}$ such that the linear extension $F^L$ is weakly dense
in $Q$, where $F = \{f_n\}$, and $Q$ is the space conjugate to $P^C$. Although with
weak convergence of functionals as the definition of limit, $Q$ is not a Banach
space, the theory of series expansions in $Q$ in terms of a weakly minimal set is
similar to that given here for Banach spaces.

Definition. If $x \in P^C$, where $P = \{p_n\}$ is minimal, $\epsilon_n(x)$ is defined to be
the greatest lower bound of $|x - z_n|$ for all linear combinations $z_n$ of
$\{p_1, \dots, p_n\}$, and $\epsilon_n(x)$ will be called the nth order of approximation of
$x$ by $P$. It is clear that $\lim_{n \to \infty} \epsilon_n(x) = 0$, and that the bound $\epsilon_n(x)$ is actually
attained for at least one $z_n$.


THEOREM 9. If $P = \{p_n\}$ is minimal and normalized, and $\xi_m$ is the mth
expansion coefficient of $x \in P^C$ in terms of $P$, then $|\xi_m| \leq \epsilon_n(x)$ for all $m > n$.

Proof. Let $\delta$ be any positive number, and let $m > n$. By definition of
$\epsilon_n(x)$, there exist numbers $\alpha_r$ such that

(11) $|x - \sum_{r=1}^{n} \alpha_r p_r| < \epsilon_n(x) + \delta$.

By Theorem 4, there exists a linear combination $z_m$ of the elements $p_{m+1},
p_{m+2}, \dots$ such that

(12) $|\sum_{r=1}^{m} \xi_r p_r + z_m - x| < \delta$.

Adding (11) and (12) gives

(13) $|\xi_m p_m + w_m| < \epsilon_n(x) + 2\delta$,

where $w_m$ is a linear combination of the elements $(P - p_m)$. Hence by
Theorem 5, $|\xi_m| < \epsilon_n(x) + 2\delta$, since $P$ is normalized. Since $\delta$ is arbitrary,
it follows that $|\xi_m| \leq \epsilon_n(x)$, which was to be proved.
THEOREM 10. If $\sum_{n=1}^{\infty} \xi_n p_n$ is the series expansion of $x \in P^C$ in terms of the
minimal and normalized set $P = \{p_n\}$, then $\lim_{n \to \infty} \xi_n = 0$.

This follows from Theorem 9 and the fact that $\epsilon_n(x) \to 0$. Theorem 10
is an analogue of the Riemann-Lebesgue theorem. Theorem 9 will be used
later to derive a sufficient condition for the convergence of the series $\sum_{n=1}^{\infty} \xi_n p_n$ to $x$.

THEOREM 11. If $\{f_n\}$ are the coefficient functionals associated with the
minimal and normalized set $P = \{p_n\}$, then $|f_n| = 1$ for every $n$.


Proof. The norm $|f_n|$ is defined to be the greatest lower bound of real
numbers $M_n$ such that

(14) $|f_n(x)| \leq M_n |x|$

for all $x \in P^C$. Now the inequality (14) holds for $M_n = 1$, for let $x$ be any
element of $P^C$ and $z$ be such that $|x - z| < \delta$ and $z = \sum_{r=1}^{m} \alpha_r p_r$, where $\alpha_n = \xi_n = f_n(x)$.
Such an element $z$ exists by Theorem 1. Since $P$ is normalized it follows from
Theorem 5 that $|f_n(x)| = |\xi_n| \leq |z| < |x| + \delta$. Since $\delta$ is arbitrary,
$|f_n(x)| \leq |x|$, which is (14) with $M_n$ replaced by 1. On the other hand,
(14) will not hold for all $x \in P^C$ if $M_n < 1$. For, since $P$ is normalized, there
exists for every $\delta > 0$ an element $z$ of the form $p_n + \sum_{r \neq n} \alpha_r p_r$, such that
$|z| < 1 + \delta$. It follows from (14) that $1 = |f_n(z)| \leq M_n |z| < M_n(1 + \delta)$.
Since $\delta$ is arbitrary, $M_n \geq 1$. Hence $|f_n| = 1$.


6. Summability of minimal series. The connection between an element
$x$ and its minimal series expansion $\sum_{n=1}^{\infty} \xi_n p_n$ becomes more evident if it can
be shown that the series is summable by some method to $x$. Theorems 1, 3
and 4, which show that the expansion coefficients $\xi_n$ are limits approached by
the coefficients of linear combinations of $\{p_n\}$ which actually converge to $x$,
suggest the application of convergence factor methods of summability (C. N.
Moore [8]) to minimal series. Such a method of summability is called
regular if it always sums a convergent series of numbers to its actual sum.
More general methods which are here called semiregular seem appropriate for
minimal series in a function space or linear vector space. The summability
theorems proved below also provide an answer to the question of whether or
not a series $\sum_{n=1}^{\infty} c_n p_n$ is the expansion of some element $x$ of $P^C$.


The series (1) } a of elements 2, of a real or complex Banach space B 

n=1 
is said to be summable to the element x of B by the method (@mn), if (%mn) 
is a triangular matrix of real or complex numbers, where n ranges from 1 to 


m 
m for m1, 2,:--, and lim } @mn%,—=z2. The series (1) is said to be 
moo n=1 


summable to x by the method (@mn) of infinite range, if (%mn) is an infinite 
matrix of real or complex numbers, where n ranges from 1 to «, provided 


converges for every m, and lim = @. 


n=1 n=1 


A method of summability $(\alpha_{mn})$ of finite or infinite range will be
called semiregular if $\lim_{m \to \infty} \alpha_{mn} = 1$ for every n. A
semiregular method which is not regular may sum a convergent series of
numbers to a number different from the sum of the original series. In
particular, any method of rearranging the order of the terms of a series is
a semiregular method of summability. Such a method is in general not
regular, as may be seen from the case of a conditionally convergent series
of numbers. The fact that rearranging the order of the elements $\{p_n\}$ of
a minimal set P gives a set which is still minimal suggests the advisability
of considering semiregular methods of summability of minimal series.
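The remark about rearrangements can be illustrated numerically. The sketch below is not from the paper: it assumes the alternating harmonic series as the conditionally convergent example and a "two positive terms, then one negative term" rearrangement, and the helper names are hypothetical. Row m of the 0/1 matrix selects the first m terms of the rearranged series, so every fixed column is eventually 1 (semiregular), yet the transformed sums approach (3/2) log 2 instead of log 2.

```python
import math

def term(n):
    """n-th term (1-indexed) of the alternating harmonic series 1 - 1/2 + 1/3 - ..."""
    return (-1) ** (n + 1) / n

def rearranged_indices(count):
    """First `count` indices of the rearrangement: two odd indices, then one even index."""
    out = []
    odd, even = 1, 2
    while len(out) < count:
        out.extend([odd, odd + 2, even])
        odd += 4
        even += 2
    return out[:count]

def row(m):
    """Support of row m of the 0/1 matrix: alpha_{mn} = 1 iff n is among the
    first m indices of the rearrangement, and 0 otherwise."""
    return set(rearranged_indices(m))

def transform(m):
    """m-th transformed sum: sum over n of alpha_{mn} * term(n)."""
    return sum(term(n) for n in row(m))

ORDINARY_SUM = math.log(2)          # sum of the series in its original order
REARRANGED_SUM = 1.5 * math.log(2)  # limit of the rearranged partial sums
```

Since each index n enters `row(m)` for all sufficiently large m, the column limits are all 1, which is exactly the semiregularity condition; regularity fails because the two limits differ.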


THEOREM 12. If the series (1) $\sum_{n=1}^{\infty} \xi_n p_n$ is summable
by a semiregular method $(\alpha_{mn})$ of finite or infinite range to the
element x of $P^c$, where $P = \{p_n\}$ is minimal, then (1) is the series
expansion of x in terms of P.


This follows from Theorems 3 and 6, and the definition of semiregular 
summability. 


THEOREM 13. If (1) $\sum_{n=1}^{\infty} \xi_n p_n$ is the series expansion
of the element x of $P^c$ in terms of the minimal set $P = \{p_n\}$, and if
only a finite number of the coefficients $\xi_n$ are zero, then (1) is
summable to x by at least one semiregular method $(\alpha_{mn})$ of finite
range.


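Proof. Suppose $\xi_k \neq 0$ for k > n. By Theorem 4 there exists a
sequence $z_m \to x$ of the form $z_m = s_n + \sum_{k=n+1} \beta_{mk} p_k$,
where $s_n$ is the n-th partial sum of (1) and
$\lim_{m \to \infty} \beta_{mk} = \xi_k \neq 0$ for every k > n. Hence
$z_m = \sum_{k=1} \alpha_{mk} \xi_k p_k$, where $\alpha_{mk} = 1$ for
$k \le n$, and $\alpha_{mk} = \beta_{mk} / \xi_k$ for k > n. Since
$\lim_{m \to \infty} \alpha_{mk} = 1$ for every k, the method
$(\alpha_{mk})$ is semiregular, which proves the theorem.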

Since, without further assumptions concerning the minimal set P, it may
happen that two different elements x and y of $P^c$ have the same series
expansion in terms of P, it follows from Theorem 13 that two different
semiregular methods may sum the same series (1) to different limits x and y,
while (1) may converge to still a third limit. An interesting example is the
following. Consider the real Banach space C of real continuous functions
x(t) on the closed interval [a, b], where a < -1 and 1 < b. The norm |x|
of x(t) is defined to be max |x(t)| on [a, b]. The Legendre polynomials for
the interval [-1, 1] are minimal in C. Two functions x(t) and y(t) of C
which differ only outside the interval [-1, 1] have the same series expansion
in terms of these polynomials. If this series expansion has only a finite
number of zero coefficients, it is uniformly summable by different semiregular
methods to the different functions x(t) and y(t).

It may also be seen that the hypothesis of Theorem 13 that only a finite
number of the coefficients $\xi_n$ are zero, cannot be entirely omitted.
Consider a function x(t) of the space C which is an even function on the
interval [-1, 1], but is not an even function on the larger interval [a, b].
All the odd coefficients in the series expansion of x(t) in terms of
Legendre polynomials will vanish. Hence the series expansion of x(t) is not
summable to x(t) by any semiregular method, since no such method can
introduce odd polynomials not present in the original series, and a series
of even polynomials can converge only to an even function.
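The vanishing of the odd Legendre coefficients can be checked on a concrete function. The following is a minimal sketch under assumptions not in the paper: the sample function, the symmetric interval [a, b] = [-2, 2] and the midpoint quadrature are all illustrative choices. The coefficients are the usual ones, (2n + 1)/2 times the integral of x(t) P_n(t) over [-1, 1], with n the degree.

```python
def legendre(n, t):
    """P_n(t) via the recurrence (k+1) P_{k+1} = (2k+1) t P_k - k P_{k-1}."""
    p_prev, p = 1.0, t
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p

def coefficient(x, n, steps=2000):
    """Legendre coefficient of degree n on [-1, 1], by the midpoint rule."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = -1.0 + (i + 0.5) * h
        total += x(t) * legendre(n, t)
    return (2 * n + 1) / 2.0 * total * h

def x(t):
    """Even on [-1, 1], but not even on [-2, 2]: the cubic term is active only for t > 1."""
    return t * t + max(0.0, t - 1.0) ** 3
```

On [-1, 1] this function coincides with t^2, so only the coefficients of degree 0 and 2 are nonzero, although x(1.5) and x(-1.5) differ.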


7. Totality. If $P = \{p_n\}$ is minimal, the set of coefficient
functionals $\{f_n\}$ associated with P is said to be total if the condition
$f_n(x) = 0$ for all n implies that $x = \theta$ for x in $P^c$. (Banach
[1], pp. 42, 106). A condition equivalent to totality is that the only
element whose expansion coefficients are all zero, is the element $\theta$.


THEOREM 14. If the condition of totality holds for the minimal set P,
then the elements x and y of $P^c$ have the same series expansion in terms
of P only if x = y.

THEOREM 15. If the condition of totality holds for the minimal set
$P = \{p_n\}$, and the series expansion $\sum_{n=1}^{\infty} \xi_n p_n$ of an
element x of $P^c$ in terms of P is summable by a semiregular method of
finite or infinite range to the element y, then x = y.

This is a consequence of Theorems 12 and 14. Together with Theorem 12,
Theorem 15 insures that semiregular summability methods will never sum a
minimal series to the wrong sum. Together with Theorem 13, it provides a
necessary and sufficient condition that a series of the form
$\sum_{n=1}^{\infty} c_n p_n$ be the minimal series expansion of some
element x of $P^c$. Since convergence is a special case of semiregular
summability, it follows from Theorem 15 that if totality is assumed, the
series expansion of an element x, if it converges, must converge to x.
(Banach [1], p. 106).

8. Absolute convergence. A series $\sum_{n=1}^{\infty} x_n$ of elements of
a real or complex Banach space B is said to be absolutely convergent if the
series of norms $\sum_{n=1}^{\infty} |x_n|$ converges. Since a Banach space
is complete, absolute convergence implies the existence of a sum x of the
series.


THEOREM 16. If $P = \{p_n\}$ is minimal and normalized, and the condition
of totality holds, and $e_n(x)$ is the order of approximation by P of the
element x of $P^c$, and if the series of numbers
$\sum_{n=1}^{\infty} e_n(x) |p_n|$ converges, then the series expansion
$\sum_{n=1}^{\infty} \xi_n p_n$ of x in terms of P converges absolutely to x.

This follows from Theorems 9 and 15. Absolute convergence of a series
in a Banach space implies various types of unconditional convergence (Banach
[1], p. 240).


9. Density. The condition of totality is related to the question of
whether the linear combinations of the coefficient functionals $\{f_n\}$
associated with a minimal set P are weakly dense in the space Q conjugate
to $P^c$.


THEOREM 17. If the linear combinations of the coefficient functionals
$\{f_n\}$ associated with a minimal set $P = \{p_n\}$ are weakly dense in the
space of functionals Q conjugate to $P^c$, then the set $\{f_n\}$ is total.


Suppose $\{f_n\}$ is not total, but that on the contrary $f_n(x) = 0$ for
all n, for some element x of $P^c$ not equal to $\theta$. Then there exists a
functional f in Q such that f(x) = 1. Since linear combinations of $\{f_n\}$
are dense in Q, there exists a sequence $\{g_n\}$ of such linear combinations
converging to f in the sense of weak convergence of functionals. But
$g_n(x) = 0$ for all n, whereas f(x) = 1, which is a contradiction.

The case in which the coefficient functionals $\{f_n\}$ are weakly dense in
the space Q conjugate to $P^c$ is an important one in the applications of
biorthogonal series, since biorthogonal expansions then exist for all
elements of Q as well as of $P^c$. Hence it is of interest that the condition
of totality holds in this case.


10. Biorthogonalization. Let B be a separable real or complex Banach
space, and $\bar{B}$ be its conjugate. Let the sequences $\{q_n\}$ and
$\{h_n\}$, each linearly independent, be given such that their linear
combinations are respectively dense and weakly dense in B and $\bar{B}$, and
suppose that $h_n(q_n) \neq 0$. Then there exists a procedure for
constructing a biorthogonal system $\{p_n, f_n\}$ of linear combinations of
$\{q_n\}$ and $\{h_n\}$. (Kaczmarz and Steinhaus [2], p. 265). Let
$p_1 = q_1$ and $f_1 = \beta_{11} h_1$ so that $f_1(p_1) = 1$. Similarly, let
$p_2 = \alpha_{21} q_1 + q_2$, and $f_2 = \beta_{21} h_1 + \beta_{22} h_2$ so
that $f_1(p_2) = f_2(p_1) = 0$; $f_2(p_2) = 1$, and in general,
$p_n = \sum_{r=1}^{n-1} \alpha_{nr} q_r + q_n$,
$f_n = \sum_{r=1}^{n} \beta_{nr} h_r$, so that $f_m(p_n) = f_n(p_m) = 0$ for
m < n, while $f_n(p_n) = 1$. These conditions are sufficient to determine the
coefficients $\alpha_{nr}$ and $\beta_{nr}$, and it can be seen that
$\beta_{nn} \neq 0$. The elements $\{p_n\}$ are minimal by Theorem 7 and the
set $\{f_n\}$ is total by Theorem 17.


11. Schauder series. A particularly simple case of minimal series is
the theory of series expansions in terms of the functions defined by Schauder
(Kaczmarz and Steinhaus [2], p. 50). Let C be the space of real continuous
functions x(t) defined on the interval [a, b], the norm of x(t) being
max |x(t)| on [a, b]. Let $\{t_n\}$ be a sequence of distinct points dense in
[a, b] with $t_1 = a$ and $t_2 = b$. Define $p_1(t)$ and $p_2(t)$ to be
linear on [a, b] with $p_1(t_1) = p_2(t_2) = 1$ and
$p_1(t_2) = p_2(t_1) = 0$. For n > 2, let $p_n(t)$ be continuous on [a, b],
and linear on the intervals into which the points
$\{t_1, t_2, \dots, t_n\}$ divide [a, b], with $p_n(t_n) = 1$ and
$p_n(t_k) = 0$ for k < n. The set $P = \{p_n(t)\}$ is minimal in C, and the
linear combinations of elements of P, being arbitrary polygonal lines with
corners at the points $\{t_n\}$, are dense in C. The minimal series expansion
of an arbitrary element x(t) of C in terms of Schauder functions converges
uniformly to x(t), so that these functions form a basis for the space C. The
coefficient functionals $\{f_n(x)\}$ depend only on the value of x(t) at
three points, and may be expressed in terms of the second divided difference
$x[t_i, t_n, t_j]$, where $t_i$ and $t_j$ are the points of
$\{t_1, t_2, \dots, t_n\}$ immediately to the left and right of $t_n$. Since
the linear combinations of these coefficient functionals $\{f_n\}$ are weakly
dense in the space $\bar{C}$ conjugate to C, an arbitrary functional of
$\bar{C}$ may be expanded in a minimal series in terms of the set $\{f_n\}$.

Kaczmarz and Steinhaus state incorrectly ([2], p. 51) that any finite
number of Schauder functions $\{p_n(t)\}$ for n > 2 may be omitted, and the
remaining functions will still have their linear combinations dense in C. If
this were true, the Schauder functions would not be minimal. Actually, they
are part of the biorthogonal system $\{p_n(t), f_n(x)\}$ and hence minimal by
Theorem 7. If a finite number of the points $\{t_n\}$ are omitted, then
infinitely many of the Schauder functions must be redefined.
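The three-point character of the Schauder coefficient functionals can be made concrete. The sketch below is an illustration rather than the paper's construction. It assumes that for n > 2 the coefficient is the value of x at $t_n$ minus the value at $t_n$ of the chord joining the neighbouring earlier points (the polygonal interpolant built so far). That quantity equals $-(t_n - t_i)(t_j - t_n)$ times the second divided difference. Function names are hypothetical.

```python
def second_divided_difference(x, ti, tn, tj):
    """x[t_i, t_n, t_j] for three distinct points."""
    return ((x(tj) - x(tn)) / (tj - tn) - (x(tn) - x(ti)) / (tn - ti)) / (tj - ti)

def schauder_coefficients(x, points):
    """Coefficients f_n(x) for Schauder functions built on `points`,
    where points[0] = a, points[1] = b and all points are distinct."""
    a, b = points[0], points[1]
    coeffs = [x(a), x(b)]   # f_1(x) = x(t_1), f_2(x) = x(t_2)
    nodes = [a, b]          # points used so far
    for t in points[2:]:
        left = max(s for s in nodes if s < t)    # neighbour t_i on the left
        right = min(s for s in nodes if s > t)   # neighbour t_j on the right
        chord = x(left) + (x(right) - x(left)) * (t - left) / (right - left)
        coeffs.append(x(t) - chord)              # uses x at three points only
        nodes.append(t)
    return coeffs
```

For x(t) = t^2 every second divided difference is 1, so each coefficient with n > 2 reduces to $-(t_n - t_i)(t_j - t_n)$, which makes the result easy to check by hand.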


12. Minimal polynomials. A method of constructing sets of minimal
polynomials $\{p_n(t)\}$, where $p_n(t)$ is of degree exactly n - 1, is the
following. In the space C of real functions x(t) continuous on the interval
[a, b], with norm defined to be max |x(t)| on [a, b], let $\{h_n(x)\}$ be any
set of independent linear functionals on C whose linear combinations are
weakly dense in $\bar{C}$. If $x_n(t)$ is the function equal to $t^{n-1}$ on
[a, b], the process of biorthogonalization applied to the sets $\{x_n(t)\}$
and $\{h_n(x)\}$ leads to a minimal set $\{p_n\}$ of polynomials of the form
$p_n(t) = t^{n-1} + \sum_{r=1}^{n-1} \alpha_{nr} t^{r-1}$. The associated
coefficient functionals $\{f_n\}$ are linear combinations of $\{h_n\}$, and
their linear combinations are weakly dense in $\bar{C}$. Hence the set
$\{f_n\}$ is total by Theorem 17. In particular, if $\{t_n\}$ is a sequence
of distinct points dense on the interval [a, b], the functionals
$h_n(x) = x(t_n)$ have their linear combinations weakly dense in the space
$\bar{C}$. The corresponding minimal polynomials obtained by
biorthogonalization are the Newton polynomials $\{p_n(t)\}$ where
$p_1(t) = 1$, and $p_n(t) = \prod_{r=1}^{n-1} (t - t_r)$ for n > 1. The
associated coefficient functionals are $f_1(x) = x(t_1)$, and
$f_n(x) = x[t_1, t_2, \dots, t_n]$, the divided difference operation, for
n > 1. Thus Newton interpolation series are a special case of minimal
polynomial series. (Walsh [4], p. 53). Applying Theorems 12, 13, and 15 gives


THEOREM 18. If x(t) is a function real and continuous on the interval
[a, b], and $\{t_n\}$ is a sequence of distinct points dense on [a, b], then

1) if the Newton interpolation series for x(t) in the points $\{t_n\}$ is
uniformly summable by a semiregular method of finite or infinite range to a
function y(t), then x(t) = y(t), and

2) if only a finite number of coefficients of the Newton interpolation series
for x(t) are zero, then the series is uniformly summable to x(t) by at least
one semiregular method of finite range.
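The biorthogonality of the Newton polynomials and the divided-difference functionals can be verified directly on a few nodes. The sketch below assumes four illustrative nodes and hypothetical helper names. It checks that $f_m(p_n)$ is 1 for m = n and 0 otherwise, and that the first four terms of the Newton series reproduce a cubic exactly.

```python
def divided_difference(x, nodes):
    """x[t_1, ..., t_k] by the recursive definition; nodes must be distinct."""
    if len(nodes) == 1:
        return x(nodes[0])
    upper = divided_difference(x, nodes[1:])
    lower = divided_difference(x, nodes[:-1])
    return (upper - lower) / (nodes[-1] - nodes[0])

def newton_polynomial(n, nodes):
    """p_n(t) = prod_{r=1}^{n-1} (t - t_r), with p_1 = 1 (1-indexed as in the text)."""
    def p(t):
        value = 1.0
        for r in range(n - 1):
            value *= t - nodes[r]
        return value
    return p

def coefficient_functional(m, nodes):
    """f_m(x) = x[t_1, ..., t_m]."""
    return lambda x: divided_difference(x, nodes[:m])

def newton_partial_sum(x, nodes, terms):
    """Partial sum of the Newton interpolation series with `terms` terms."""
    coeffs = [coefficient_functional(k, nodes)(x) for k in range(1, terms + 1)]
    polys = [newton_polynomial(k, nodes) for k in range(1, terms + 1)]
    return lambda t: sum(c * p(t) for c, p in zip(coeffs, polys))
```

The recursion is exponential in the number of nodes, which is acceptable for a small check but not for serious computation; a tabular scheme would be the usual choice there.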


13. Applications to complex variable theory. There are several types
of complex Banach space which correspond to the space C of real continuous
functions. For example, let K be a Jordan arc in the finite complex plane.
In the first place there is the Banach space $C_K$ of all complex valued
continuous functions f(t) defined on K, with the norm of f(t) defined to be
max |f(t)| on K. Then there is the space $A_K$ of functions analytic on K,
with the same definition of norm. The question of the form of the most
general linear functional on these spaces arises when the conjugate spaces
are considered. The theory is different depending on whether K is assumed to
be rectifiable or not.

Again, let B be a simple closed curve in the finite complex plane, and
let G be the interior of B. Consider the Banach space $\mathfrak{A}_B$ of all
functions f(z) holomorphic in G and continuous in G + B, with the norm of
f(z) defined to be max |f(z)| on B. Again the question of the form of the
most general linear functional on the space $\mathfrak{A}_B$ arises, and
again the treatment differs depending on whether B is rectifiable or not. In
either case, however, Walsh has shown ([4], pp. 36, 39) that the set of all
polynomials is dense in all three spaces $C_K$, $A_K$, and $\mathfrak{A}_B$.
Hence in all of these spaces there is a theory of the series expansion of a
function in terms of a set of minimal polynomials. This theory includes as a
special case the theory of expansions in a series of polynomials relatively
orthogonal on the arc K or the curve B with respect to a weight function. It
also includes the theory of polynomial interpolation series. In each case
the results of this paper concerning semiregular summability apply, and lead
to theorems of the type of Theorem 18.

In all three Banach spaces there is the theory of Newton interpolation
series in a set of distinct points $\{t_n\}$ everywhere dense on the arc K
or the curve B. In the space $A_K$ there is also the theory of interpolation
in a set of points not necessarily distinct or dense in K. In the space
$\mathfrak{A}_B$ there is the theory of expansions in polynomial
interpolation series in points of the interior G of the curve B not
necessarily distinct (Walsh [4]). In particular, power series are of this
type.

THEOREM 19. If the function f(z) is holomorphic on the interior G of
a simple closed curve B of the finite z-plane, and f(z) is continuous on
G + B, and if the power series expansion (1)
$\sum_{n=0}^{\infty} f^{(n)}(a) (z - a)^n / n!$ of f(z) about the point a of
G is uniformly summable by a semiregular method of finite or infinite range
to a function g(z) on B, then f(z) = g(z). Furthermore, if only a finite
number of the coefficients of (1) are zero, the series (1) is uniformly
summable to f(z) on G by at least one semiregular method of finite range.


This follows from Theorems 12, 13, and 15, and from the fact that
polynomials are dense in the space in question, that the system
$\{(z - a)^n, f^{(n)}(a)/n!\}$ is biorthogonal, and that the functionals
$\{f^{(n)}(a)/n!\}$ are total.

Of course there are many other types of series expansion in the theory of 
functions of a complex variable to which the theory of minimal series of this 
paper can be applied. From the few examples given here it can be seen that 
the value of treating such series expansions from the point of view of Banach 
spaces is chiefly that this viewpoint suggests new problems and methods, and 
links together subjects that might otherwise seem unrelated. In the case of 
any one type of expansion it is to be expected that special methods will give 


sharper results than any to be derived from the general theory given here. 


THE PENNSYLVANIA STATE COLLEGE. 


BIBLIOGRAPHY. 


1. S. Banach, Théorie des opérations linéaires, Warsaw, 1932. 

2. S. Kaczmarz and H. Steinhaus, Theorie der Orthogonalreihen, Warsaw, 1935. 

3. C. N. Moore, Summable series and convergence factors, Colloquium Publications
22, New York, 1938.

4. J. L. Walsh, Interpolation and approximation by rational functions in the 
complex domain, Colloquium Publications 20, New York, 1935. 




ON LEIBNIZ’S DEFINITION OF PLANES.* 


By HERBERT BUSEMANN. 


1. The well known unsatisfactory Euclidean definitions of straight lines
and planes have evoked early attempts to replace them by better ones.
Leibniz¹ proposed to define a plane as the locus of points which have equal
distances from two given points, and the straight line as locus of points
having equal distance from three given points. Since the idea of metric
space or anything which could be substituted for it had not yet been developed,
this attempt of Leibniz did not grow into a well-founded theory. The present
note tries to give a critical investigation of the implications of Leibniz’s
definition from the point of view of metric spaces.

Once we have a metric, the straight lines must be the geodesics. To
furnish something which deserves the name of a plane the requirement has to
be added to Leibniz’s definition that the locus of those points which have equal
distances from two given points is a full geodetic manifold, i.e. that with any
two points it should also contain the geodesics connecting them. Our purpose
is to show that this condition essentially already implies the Euclidean or
Hyperbolic character of the metric.

We shall see first (section 2) that in order to exclude certain degenerate
metric spaces one has to assume the space to be convex, externally convex and
finitely compact. Let these conditions be satisfied and assume furthermore
that for any two points A, B the locus m(A, B) of those points which have
equal distances from A and B with any two points X, Y also contains each
geodesic through X and Y. Then we shall see in sections 3 and 4 that the
space is homeomorphic to the Euclidean space $E_n$ of some dimension n and
that its metric is Euclidean or Hyperbolic. It is not sufficient to require that
each segment connecting X and Y is contained in m(A, B) unless the condition
of external convexity is replaced by strict external convexity.

A motion of a metric space is a single valued mapping of the space onto 
itself which preserves distances. The Euclidean or Hyperbolic geometries 
have the property of free movability in this sense: if the two sets o and o’ are 
congruent then a motion of the whole space exists carrying o into o’. It is 
obvious that under certain convexity and compactness conditions the Euclidean 


* Received April 29, 1940. 

¹ Mathematische Schriften, zweite Abteilung 1, p. 166. Later, many other
mathematicians took up Leibniz’s definition. For references see F. Enriques,
“Prinzipien der Geometrie,” Enzyklopädie der Math. Wiss., vol. III 1 (1907),
p. 19.



and Hyperbolic geometries are the only ones with free movability. The 
question arises whether the nature of the sets o can be restricted. 


One easily derives from our theorem: 


Let an internally and externally strictly convex, finitely compact metric
space have the property that to any two congruent triples ABC and A’B’C’ a
motion exists carrying ABC into A’B’C’. Then its metric is either Euclidean
or Hyperbolic.


This theorem does not claim to be final, since probably further reductions 


of the hypothesis will be possible. 


2. A very simple example of a finite geometry in which Leibniz’s axioms
hold can be gotten as follows. Let the space consist of n + 2 points
$P_1, \dots, P_{n+2}$ and put $P_i P_k = 1$ for $i \neq k$, $P_i P_i = 0$.
The locus of those points which have equal distances from r given points
will consist of the n + 2 - r points different from the r given points.
Calling any set of m points an (m - 1)-dimensional linear space $L_{m-1}$,
we see that exactly one $L_{m-1}$ passes through any m points, which are in
no $L_{m-2}$, and that an $L_{m-1}$ is the locus of those points which have
equal distances from n + 2 - m fixed points.
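The finite example is small enough to enumerate. The following sketch is an illustration with hypothetical names: points are labelled 0, ..., N - 1 rather than $P_1, \dots, P_{n+2}$, and the locus of points equidistant from r >= 2 given points is computed as in the text. It comes out as the complement of the given points.

```python
def distance(i, k):
    """Discrete metric of the example: 0 for equal points, 1 otherwise."""
    return 0 if i == k else 1

def equidistant_locus(n_points, given):
    """Points of {0, ..., n_points - 1} whose distances to all points of
    `given` coincide."""
    return [p for p in range(n_points)
            if len({distance(p, g) for g in given}) == 1]
```

For a single given point every point qualifies trivially, so the statement in the text is meaningful for r >= 2.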

This trivial example shows that some assumption regarding connectivity
is necessary, but even the strongest assumptions in this respect without
convexity would not be sufficient to insure the Euclidean or Hyperbolic
character of the metric. For let e(A, B) be the Euclidean metric of the
$E_n$. Putting

AB = e(A, B) + log(1 + e(A, B))

we have AC + CB > AB unless C = A or C = B. The geodesics of the
metric are still the Euclidean straight lines (for “straight line” we shall use
the abbreviation “s.l.”). The loci m(A, B) are the hyperplanes, which
therefore with any two points also contain the geodesics connecting them. We
therefore require convexity, i.e. that to any two points x, z a point y
different from x and z exists with

xy + yz = xz.

(If this relation holds for three different points x, y, z, we shall write
(xyz)). Every convex subset of the $E_n$ satisfies this condition. Therefore
external convexity must be required, i.e. that with any two points x, y also
points v and w with (vxy) and (xyw) exist.
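The failure of convexity for the metric AB = e(A, B) + log(1 + e(A, B)) can be seen numerically. The sketch below is illustrative only and works in the plane with hypothetical helper names. It evaluates the excess AC + CB - AB, which stays positive even for the Euclidean midpoint C, and confirms that a point on the Euclidean perpendicular bisector is still equidistant in the new metric.

```python
import math

def euclid(a, b):
    """Euclidean distance between two points of the plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def d(a, b):
    """The modified metric AB = e(A, B) + log(1 + e(A, B))."""
    e = euclid(a, b)
    return e + math.log(1.0 + e)

def excess(a, c, b):
    """AC + CB - AB; zero would mean that C lies between A and B."""
    return d(a, c) + d(c, b) - d(a, b)
```

For A = (0, 0), B = (2, 0) and the Euclidean midpoint C = (1, 0) the excess equals log(4/3), so no point lies between A and B. Since the function e + log(1 + e) is strictly increasing, equal Euclidean distances still give equal distances, which is why the loci m(A, B) remain hyperplanes.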
But this would still admit open convex sets in $E_n$ and also certain
denumerable subsets of $E_n$; to exclude them we have to require
completeness. In order to also exclude spaces of infinite dimension we
require the stronger property of finite compactness, or the validity of the
Bolzano-Weierstrass principle: every bounded sequence of points has an
accumulation point.

Call “straight line” (s.l.) any subset of our space which is congruent
to a Euclidean straight line. It was proved by Menger² that under the above
conditions an s.l. passes through any two given points. In particular there
are segments connecting any two given points.


We now require for each pair of points K ≠ L:


$L_1$: If m(K, L) contains A, B then it also contains each segment connecting
A and B.


Then the segment connecting two points is unique. For otherwise two
points R and S could be found and two segments $s_1$ and $s_2$ connecting
them, which have only the points R and S in common. Let $C_1$ and $C_2$ be
the midpoints of R and S on $s_1$ and $s_2$ respectively. Then
$m(C_1, C_2)$ would contain R and S, therefore $s_1$ and with it $C_1$. But
$C_1$ cannot be on $m(C_1, C_2)$ since $C_1 \neq C_2$. We shall designate by
RS the unique segment from R to S.

But it does not follow from $L_1$ that the whole s.l. g through A, B is in
m(K, L) as soon as the points A and B are on m(K, L), as the following
example shows: Take three rays in $E_n$ issuing from the point O. For any
two points on the same ray we put AB equal to the Euclidean distance. If
A, B are on different rays we put AB = AO + OB. For any two points K, L
which have not the same distance from O, the set m(K, L) consists of one
point. If KO = OL, the set m(K, L) will consist of the ray which does not
contain K or L; hence m(K, L) does not contain the s.l. connecting two
different points of m(K, L). Therefore we must either require that the
external convexity is strict:


$s_0$: For any two points x, y and any number r > 0 there exists at most
one point w such that (xyw) and yw = r,

or we have to replace $L_1$ by the stronger condition


$L_2$: With any two points A, B, A ≠ B, each s.l. through A, B is contained
in m(K, L).
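The three-ray example can be checked on a grid. The sketch below is a hypothetical encoding, not taken from the paper: a point is a pair (ray index, distance from O), with O written as (0, 0.0), and the locus m(K, L) is sampled at multiples of a fixed step along each ray.

```python
def dist(p, q):
    """Metric of the three-ray space: Euclidean along one ray, through O otherwise."""
    (i, s), (j, t) = p, q
    if i == j or s == 0 or t == 0:
        return abs(s - t)
    return s + t

def bisector(k, l, step=0.25, n_steps=12):
    """Grid points (O plus n_steps points on each ray) equidistant from k and l."""
    grid = [(0, 0.0)] + [(ray, j * step) for ray in range(3)
                         for j in range(1, n_steps + 1)]
    return [p for p in grid if abs(dist(p, k) - dist(p, l)) < 1e-12]
```

With K and L at equal distance from O on rays 0 and 1, the sampled locus is O together with ray 2. Any s.l. through two of its points continues past O into ray 0 or ray 1 and so leaves the locus. With unequal distances the locus reduces to a single point.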


Our main theorem will then be


THEOREM 1. A convex, externally convex, finitely compact metric space,
in which conditions $s_0$ and $L_1$ or in which condition $L_2$ is satisfied,
is congruent to the Euclidean or Hyperbolic space of some finite dimension.


² “Untersuchungen über allgemeine Metrik,” Mathematische Annalen, vol. 100
(1928), pp. 73-163, in particular pp. 87 ss.


Since the segment AB is unique it follows from $s_0$ immediately that the
s.l. connecting A and B (if A ≠ B) is unique. We designate it by AB. The
set consisting of A, B and those points X for which (ABX) or (AXB) we
designate by $\overrightarrow{AB}$. $s_0$ is an immediate consequence of
$L_2$. For if $s_0$ was not true, four different points A, B, $C_1$, $C_2$
would exist with the properties $(ABC_1)$, $(ABC_2)$ and $BC_1 = BC_2$. Then
also $AC_1 = AC_2$, hence $m(C_1, C_2)$ would contain each s.l. through A
and B. Since segments are unique, each s.l. h through A and $C_1$ must
contain B, hence h would also be an s.l. through A and B and therefore
contained in $m(C_1, C_2)$, but $C_1$ is not in $m(C_1, C_2)$.

We see that $L_2$ contains $s_0$ and $L_1$. We shall show now that $s_0$
and $L_1$ imply $L_2$.

Assume for an indirect proof that m(K, L) contains two points A, B, but
that there is a point $Y_0$ on the s.l. AB which does not belong to m(K, L),
for instance $Y_0 K < Y_0 L$. $Y_0$ cannot belong to the segment AB on
account of $L_1$. Let X be any point of BL, and Y the point on
$\overrightarrow{AX}$ for which

$AY = AX + BY_0 \cdot \frac{XL}{BL}$.

For X = B we have $Y = Y_0$ and therefore KY < LY, for X = L we have
Y = L and therefore KY > LY. As X traverses BL, the point Y traverses a
Jordan arc from $Y_0$ to L, therefore X must pass a point $X_1$, such that
for the corresponding point $Y_1$ we have $KY_1 = LY_1$, or
$Y_1 \subset m(K, L)$. On account of $L_1$ the segment $Y_1 A$ belongs to
m(K, L). Hence $X_1$ would be on m(K, L) but

$LX_1 = LB - BX_1 = KB - BX_1 < KX_1$.


We see that both sets of conditions in Theorem 1 are equivalent and imply:


μ: The space is metric and finitely compact; any two different points
determine uniquely a s.l. which passes through them. For any K ≠ L the
locus m(K, L) contains with any two points the whole s.l. connecting them.


To have a short expression we call a space satisfying this condition μ a
μ-space.

3. We shall prove first that μ-spaces are linear, in the sense expressed by
theorems (d), (e) and (g) of this section.

A point F of a set o is called a foot of the point P if PF = greatest lower 
bound of PX as X varies over o. The center R of KL is the only foot of K 
(or L) on m(K, L). For if X is any point on m(K, L) other than R we have


2KX = KX + XL > KL = 2KR.


In particular R is the only foot of K (or L) on every s.l. g in m(K, L)
through R. Let X ≠ R be any point of g. As X’ traverses the ray of g
opposite to RX, the distance KX’ changes continuously from KR to ∞ and
therefore X’ passes a position for which X’K = XK. Since g ⊂ m(K, L) we have

XL = XK = X’K = X’L.

Hence K and L are in m(X, X’) and KL will be contained in m(X, X’).
Since R ⊂ KL we have XR = X’R. We have found


(a) If g is any straight line in m(K, L) through R (R is the center of
KL) and if X, X’ are points on g with KX = KX’ then XR = X’R.


We apply (a) to KL as s.l. in m(X, X’). Taking any two points K’ and
L’ on RK and RL respectively and with RK’ = RL’ we see that XK’ = XL’,
so X is a point of m(K’, L’), and X being arbitrary, we conclude that
m(K’, L’) ⊂ m(K, L). In the same way one gets m(K, L) ⊂ m(K’, L’).


(b) Let K and L be any two different points, R their center. If K’ 
and L’ are any two different points on KL, which have R as center, one has 
m(K,L) = m(K’,L’). 


Therefore m(K, L) only depends on the s.l. h which carries K and L,
and on the point R. We shall also use the notation m(R, h) instead of
m(K, L) or m(K’, L’).

If all points of an s.l. h have the point F as only foot in the set o, we
call h a perpendicular to o. Since K has R as only foot on m(R, h) and
m(K, L) = m(K’, L’), K’ will have R as only foot on m(R, h) and on every
s.l. in m(R, h) through R, hence h is perpendicular to m(R, h) and to every
s.l. in m(R, h) through R. With the same notations as above we see that g
is perpendicular to h because h = KL ⊂ m(X, X’). We shall prove now

(c) As R traverses h the sets m(R, h) cover the space simply.

Let X be any point not on h, F a foot of X on h, and Q any point on h
with XQ > XF. On the ray of h opposite to FQ, we can find a point Q’ with
XQ’ = XQ. Therefore X ⊂ m(Q’, Q); this shows that the sets m(R, h)
cover the space. Let F’ be the center of QQ’; according to the preceding
considerations, XF’ is perpendicular to h, therefore F’ is the only foot of
X on h and F’ = F, or m(Q, Q’) = m(F, h). There can be no surface m(R, h)
with R ≠ F through X, because R would be another foot of X on h.


(d) An n-dimensional μ-space is homeomorphic to $E_n$.


Proof. The theorem is true for 1-dimensional μ-spaces. Assume it to
be true for s-dimensional μ-spaces, s < n, and fix an s.l. h. For R ⊂ h, the
set m(R, h) is a μ-space; for if K, L are any two points in m(R, h), the set
m’(K, L) of those points in m(R, h) which have equal distances from K and L
is simply the set m(K, L) · m(R, h), which also satisfies condition μ. Let
K, L be any two points on h with R as center. Each KX with
X ⊂ m(K, L) = m(R, h) intersects m(R, h) only at X, otherwise KX would be
contained in m(K, L). Let Y be the point on KX with YX = KR. The segments
YX, X ⊂ m(R, h) form, topologically, the product $\Pi$ of m(R, h) and a
segment, hence

$\dim \Pi = \dim m(R, h) + 1$

or $s = \dim m(R, h) \le n - 1$. Therefore m(R, h) is homeomorphic to the
$E_s$. It follows from (c) that the whole space is homeomorphic to
$E_{s+1}$, therefore s = n - 1.

A finitely compact metric space of dimension d, which with any two
points also contains exactly one s.l. connecting them, will be called a
d-dimensional linear space $L_d$. We shall prove:

(e) A d-dimensional linear subspace $L_d$ of an n-dimensional μ-space is
itself a μ-space. A set m(X, Y) either contains $L_d$ or is disjoint from
$L_d$ or intersects it in a (d - 1)-dimensional μ-space.

Let K, L be any two points in $L_d$. Then m(K, L) · $L_d$ is the set of all
points in $L_d$ which have equal distances from K and L. Since m(K, L) and
$L_d$ both with any two points also contain the s.l. through them,
m(K, L) · $L_d$ does, hence $L_d$ is a μ-space. Let now X, Y be any two
points in the whole space. If m(X, Y) does not contain $L_d$ and is not
disjoint from $L_d$, its intersection with $L_d$ is a linear subspace L of
$L_d$. L decomposes $L_d$ into two sets, one consisting of those points which
are closer to X than to Y and the other of those points which are closer to
Y than to X. (Neither of these sets is empty. For since m(X, Y) does not
contain $L_d$, there is a point A in one of the sets. If we connect A to a
point C of L, the points of AC on the other side of C from A will be in the
other set). Therefore L must have dimension d - 1.⁴ We shall see next:


(f) For a given point $P_0$ and a given d ≤ n one can always find s.l.
$h_1, \dots, h_d$ through $P_0$ such that $\prod_{i=1}^{d} m(P_0, h_i)$ is an
$L_{n-d}$.

³ See W. Hurewicz, “Sur la dimension des produits Cartésiens,” Annals of
Mathematics, vol. 36 (1935), pp. 194-197.
⁴ For L is a linear space and therefore a μ-space. According to (d) it is
homeomorphic to a Euclidean space.


Proof. This is obviously true for n = 1, 2. Assume it to be true for
n - 1. Let $h_d$ be any s.l. through $P_0$; $m(P_0, h_d)$ is an $L_{n-1}$.
For any line h through $P_0$ in $m(P_0, h_d)$ the set
$m(P_0, h_d) \cdot m(P_0, h)$ is the locus $m(P_0, h)$ in this $L_{n-1}$. On
account of the inductive assumption we can find lines $h_1, \dots, h_{d-1}$
in $m(P_0, h_d)$ such that $\prod_{i=1}^{d} m(P_0, h_i)$ is an $L_{n-d}$.
We can now prove the linearity of the space: 


(g) Through d + 1, d ≤ n - 1, points $P_0, \dots, P_d$ of an n-dimensional
μ-space, which are in no $L_{d'}$ with d’ < d, there passes exactly one
$L_d$.


Proof. We consider the sets $m_i = m(P_0, P_0 P_i)$. According to (e)
$m_2$ either contains $m_1$ or intersects it in an $L_{n-2}$; we may say that
$m_2$ intersects $m_1$ in an $L_{n_2}$ with $n - 1 \ge n_2 \ge n - 2$. In the
same way $m_3$ intersects $L_{n_2}$ in an $L_{n_3}$ with
$n - 1 \ge n_3 \ge n - 3$ and so forth, finally $m_d$ intersects
$L_{n_{d-1}}$ in an $L_{n_d}$ with $n - 1 \ge n_d \ge n - d$. All s.l. in
this $L_{n_d}$ through $P_0$ are perpendicular to all $P_0 P_i$. Now (f)
shows that in $L_{n_d}$ lines $h_1, \dots, h_{n_d}$ through $P_0$ exist such
that the sets $m(P_0, h_i)$ in this $L_{n_d}$ intersect only at $P_0$, and as
a consequence of (c) no $m(P_0, h_i)$ will contain the product of the others.
It then follows again from (e) that the sets $m(P_0, h_i)$ intersect in an
$L_{n - n_d}$, which contains all $P_0 P_i$. Since $P_0, \dots, P_d$ are in
no $L_{d'}$ with d’ < d, we must have $n_d = n - d$ and for the same reason
there cannot be two different $L_d$ through these points.


4. In this section we shall establish the Euclidean or Hyperbolic
character of the metric. For this purpose it will be sufficient to prove
that a two-dimensional μ-space is either Euclidean or Hyperbolic. For any
three points are in an $L_2$, according to the last theorem, and this $L_2$
is itself a μ-space (compare (e)) and herefrom one concludes easily that the
whole space is either Euclidean or Hyperbolic. If the dimension of the space
is n ≥ 3, the Theorem of Desargues holds in every $L_2$ and the proof is
trivial. But if the whole space is two-dimensional the Theorem of Desargues
has to be proved (at least implicitly) and this accounts for the comparative
lengthiness of the following considerations.

We call motion of a metric space any single-valued mapping of the space 
onto itself which preserves distances. Obviously such a mapping is topological. 

Let now L be a two-dimensional p-space and g an s.l. in it. To every point P not on g there exists exactly one point P' such that g = m(P, P'). For on account of (a), (b) the point P' can be determined as follows: Let F be the foot of P on g, then P' will be the point on PF with PP' = 2PF = 2P'F. We complete this correspondence by mapping every point of g onto


108 HERBERT BUSEMANN. 


itself. We thus get an involutoric mapping $R_g$ of L onto itself, which we call a reflection in g. The image σ' of a set σ under $R_g$ will be designated by $\sigma R_g$. Our aim is to show that reflections are motions. We fix a definite s.l. g, put generally $X' = XR_g$, and show that $R_g$ is a motion. The proof will consist of several steps and we show first


(h) If the line PQ intersects g then PQ = P'Q'.

Proof. Let PQ intersect g at S. We have PS = P'S and QS = Q'S because g = m(P, P') = m(Q, Q'). Therefore (h) is true if S coincides with P or Q. If (PSQ) one has

$P'Q' \le P'S + SQ' = PS + SQ = PQ.$

P and Q being on different sides of g, the points P' and Q' will also be on different sides, hence P'Q' will contain a point S' of g, and one concludes in the same way that $PQ \le P'Q'$. If (SPQ), then

$P'Q' \ge SQ' - SP' = SQ - SP = PQ.$

But in this case we do not know yet that P'Q' also intersects g. The s.l. PP' intersects SQ' in a point R'. If $P' \subset R'P$ then P'Q' obviously intersects g and we can conclude $PQ \ge P'Q'$ as above. Assume therefore that $P' \not\subset R'P$. The image R of R' under $R_g$ is on PR', hence QR intersects g in a point $S_1$. We conclude from what has already been proved that QR = Q'R'. We should then have

$QS < QR + RS = Q'R' + R'S = Q'S.$


An immediate consequence hereof is: 


(k) Under the reflection $R_g$ an s.l. h which intersects g at S is mapped congruently onto an s.l. h' intersecting g at the same point S.


We shall show next 


(l) If an s.l. h intersecting g at S is bisector of two points P, Q (h = m(P, Q)) then its image $h' = hR_g$ is bisector of the corresponding points P', Q', or

$R_g R_h R_g = R_{h'}.$

Proof. We have P'S = PS = QS = Q'S. If T is on h sufficiently close to S the s.l. PT and QT will intersect g in points L and M respectively. It follows from (k) that the images of PL and QM are P'L and Q'M and that these s.l. intersect at the image T' of T on h'. Therefore P'T' = PT = QT =




ON LEIBNIZ’S DEFINITION OF PLANES. 109 


Q'T' and h' contains the two points S and T' of m(P', Q'), hence h' = m(P', Q').

Call s.l. intersecting g s.l. of the first kind. Assume s.l. of the (n − 1)-st kind have already been defined. Then we say an s.l. k is of the n-th kind if an s.l. h of the (n − 1)-st kind intersecting k exists such that $kR_h$ (which according to (k) is also an s.l.) is also of the (n − 1)-st kind. An s.l. of the n-th kind is an s.l. of the m-th kind for all $m \ge n$. For ν = 1 the following statement is contained in (k) and (l).

(m) Under $R_g$ an s.l. k of the ν-th kind is mapped congruently onto an s.l. k' of the ν-th kind and one has

$R_g R_k R_g = R_{k'}.$

We assume (m) to be true for n − 1 and we shall prove it for n.

Since k is of the n-th kind an s.l. h of the (n − 1)-st kind intersecting k at a point T exists such that $l = kR_h$ is also of the (n − 1)-st kind. Therefore, by hypothesis, l and h go under $R_g$ into s.l. l' and h' of the (n − 1)-st kind which intersect at the image T' of T. $R_{h'}$ maps l' congruently onto a line k' (see (k)), and we have

$k' = l'R_{h'} = lR_gR_{h'} = kR_hR_gR_{h'} = kR_g.$

Each step means a congruent mapping of the s.l. in question, hence k is mapped congruently onto k', and k' is of the n-th kind because h' and $l' = k'R_{h'}$ are of the (n − 1)-st kind. Finally let P and Q be any two points for which k = m(P, Q). In order to show that $R_g R_k R_g = R_{k'}$ we have to prove that k' = m(P', Q') where $P' = PR_g$, $Q' = QR_g$. Now it follows from (l) that $l = kR_h = m(PR_h, QR_h)$ and from the inductive assumption that $R_g R_l R_g = R_{l'}$ or $lR_g = l' = m(PR_hR_g, QR_hR_g)$ and finally from (l) and (m) for n − 1 that

$k' = l'R_{h'} = m(PR_hR_gR_{h'}, QR_hR_gR_{h'}) = m(P', Q').$

This completes the proof of (m). We shall see next: 

(n) Every s.l. is of some finite kind.

To prove this we need as auxiliary result, that the circles of our metric 
have finite length in our metric. For this purpose it is sufficient to show that 
they are convex curves, since the proof that a convex curve has finite length, 
is identical with the known elementary proof for the same fact in Euclidean 
geometry. 

Let e be any s.l., Q a point not on e, F the foot of Q on e so that e = m(F, QF). It follows herefrom that as X traverses one of the rays, into which F decomposes e, from F towards infinity, QX increases monotonically. Therefore one has for any three points A, B, C with (ABC) on e the inequality QB < max (QA, QC), and since e is arbitrary, this holds for any three points with (ABC). Hence the circles with center Q are convex.
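The inequality QB < max (QA, QC) can be checked numerically in the simplest model. The following is only a sketch under a stated assumption: it uses the ordinary Euclidean plane in place of the general p-space, and the helper names `dist`, `point_on_line` and `check_between_inequality` are hypothetical, introduced for illustration.

```python
import math

def dist(p, q):
    """Euclidean distance, standing in for the metric of the p-space."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def point_on_line(origin, direction, t):
    """Point of the straight line origin + t * direction."""
    return (origin[0] + t * direction[0], origin[1] + t * direction[1])

def check_between_inequality(q, origin, direction, ts):
    """True if QB < max(QA, QC) for every triple A, B, C on the line with B between A and C."""
    pts = [point_on_line(origin, direction, t) for t in sorted(ts)]
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            for k in range(j + 1, len(pts)):
                a, b, c = pts[i], pts[j], pts[k]
                if not dist(q, b) < max(dist(q, a), dist(q, c)):
                    return False
    return True

# Q is chosen off the line, as the text requires.
Q = (0.3, 1.7)
ok = check_between_inequality(Q, (-2.0, 0.0), (1.0, 0.25), [-3, -1, 0, 0.5, 2, 4, 7])
print(ok)
```

The check only illustrates the statement in one model; the argument in the text derives it from the monotonic growth of QX along each ray of e.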
To prove (n) let g be the fixed line considered before, $h_1$ any s.l. not intersecting g. We want to prove that $h_1$ is of some finite kind. Let Q be any point on $h_1$. We draw the circle γ of radius 1 around Q, and two rays issuing from Q which intersect g; they may intersect γ in A and B. Let X traverse γ from A to B in such a way that all rays QX intersect g, and let then X continue on γ beyond B until it coincides for the first time with a point $H_1$ of $h_1$. Call $\gamma_1$ the arc of γ traversed by X from A to $H_1$. On $\gamma_1$ we choose a point $H_2$ so close to $H_1$ that $H_1H_2 < \tfrac12 AB$. Reflecting $h_1$ in $h_2 = QH_2$ we get an s.l. $h_3 = QH_3$, where $H_3$ is still on $\gamma_1$. We have $H_1H_2 = H_2H_3$ (see (h)) and $H_1H_2 + H_2H_3 = 2H_1H_2$ is smaller than the length of $\gamma_1$. If $H_2$ and $H_3$ are both between A and B we stop, otherwise we reflect $h_2$ in $h_3$ and get an s.l. $h_4 = QH_4$ where $H_4$ is still on $\gamma_1$, $H_3H_4 = H_1H_2$ and $3H_1H_2$ is smaller than the length of $\gamma_1$. Since $\gamma_1$ has finite length we shall arrive at a first subscript ν such that $H_\nu$ and $H_{\nu+1}$ are both between A and B on $\gamma_1$. Putting $h_i = QH_i$ we have

$h_{i+1} = h_{i-1}R_{h_i},$

$h_{\nu+1}$ and $h_\nu$ intersect g, hence $h_{\nu-1}$ is of the second kind, $h_\nu$ and $h_{\nu-1}$ being of the second kind, $h_{\nu-2}$ is of the third kind, and so forth. Finally, we see that $h_1$ is of the ν-th kind, q.e.d. (m) and (n) prove that reflections are motions.

Let now X and Z be any two points which have the same distance from a given point S. The points X and Z divide the circle with center S and radius XS = ZS into two arcs; let $\gamma_0$ be one of them. From the convexity of $\gamma_0$ we conclude by continuity considerations that points $H_1$, Y, $H_2$ on $\gamma_0$ in this order can be found such that

$XH_1 = H_1Y = YH_2 = H_2Z.$

Then

$Y = XR_{SH_1}, \quad Z = YR_{SH_2}, \quad \text{hence} \quad Z = XR_{SH_1}R_{SH_2}.$

$R_{SH_1}R_{SH_2}$ is a motion which preserves orientation and leaves S fixed; it is therefore a rotation which carries X into Z. Since, except for the condition XS = ZS, the points X, Z, S are arbitrary, we see that our metric admits the full group of rotations around the arbitrary point S. It is a very special case of a well known theorem by Hilbert⁵ that under these conditions the metric is either Euclidean or Hyperbolic. Of course, one does not have to refer to this theorem; it is very simple to give a direct proof for the validity of the congruence axioms. This completes the proof of Theorem 1.

⁵ "Ueber die Grundlagen der Geometrie," Mathematische Annalen, vol. 56 (1902), pp. 381-422.
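The construction of a rotation as a product of two reflections can be illustrated in the Euclidean plane, which is one of the two cases the theorem allows. This is a minimal sketch under that assumption, with S placed at the origin; the functions `reflect` and `rotate` are hypothetical helpers, not notation from the paper.

```python
import math

def reflect(point, angle):
    """Reflect a point in the line through the origin S that makes `angle` with the x-axis."""
    x, y = point
    c, s = math.cos(2 * angle), math.sin(2 * angle)
    return (c * x + s * y, s * x - c * y)

def rotate(point, angle):
    """Rotate a point about the origin S by `angle`."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

theta = 1.1                      # angle XSZ, an arbitrary choice
X = (2.0, 0.0)
Z = rotate(X, theta)

# H1 and H2 lie on the arc at angles theta/4 and 3*theta/4, so that
# the chords XH1, H1Y, YH2, H2Z are all equal.
Y = reflect(X, theta / 4)        # Y = X R_{SH1}
Z_from_reflections = reflect(Y, 3 * theta / 4)   # Z = Y R_{SH2}

# A reflection preserves distances between arbitrary points.
P, Q = (0.4, -1.3), (2.2, 0.9)
preserved = math.isclose(dist(reflect(P, 0.7), reflect(Q, 0.7)), dist(P, Q))
print(Z, Z_from_reflections, preserved)
```

In this model the product of the two reflections is the rotation through the angle XSZ, which matches the role the product plays in the argument above.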


5. We are now going to discuss the application to motions mentioned in the introduction. We consider a finitely compact metric space in which for any two points exactly one s.l. through them exists. Assume that for any two congruent triples of points A, B, C and A', B', C' (AB = A'B', BC = B'C', CA = C'A') a motion of the space exists under which A, B, C go into A', B', C' respectively.

Let h be any s.l.; a motion which leaves all points of h fixed will be called a rotation around h. Let A, B be any two points of h, and let C and C' be such that AC = AC', BC = BC' (or $A + B \subset m(C, C')$). We say the metric admits the full group of motions around h if for any four such points A, B, C, C' a rotation around h exists which carries C into C', or, which amounts to the same, if a motion of the space exists carrying A, B, C into A', B', C'. We can say:

If for any two congruent triples of points a motion exists carrying the 
first triple into the second, then the metric admits the full group of rotations 
around every straight line. 

Furthermore one sees immediately that a space which admits the full group of rotations around every s.l. is a p-space. For let $A \neq B$ and $A + B \subset m(C, C')$, then AC = AC', BC = BC', therefore a rotation around AB exists carrying C into C'. Every point X of AB remains fixed, therefore XC = XC', hence $X \subset m(C, C')$, or $AB \subset m(C, C')$. Using Theorem 1 we see that our space is the Euclidean or Hyperbolic space. So we have the

THEOREM 2. If in a finitely compact metric space any two points can be connected by exactly one straight line and if the space admits the full group of rotations around every straight line, it is congruent to a finite dimensional Euclidean or Hyperbolic space. These rotations will always exist, if for any two congruent triples of points a motion of the space exists carrying the first triple into the second.


The question arises whether it would be sufficient to assume that for any 
two congruent pairs of points a motion exists carrying the first pair into the 
second. For 2-dimensional space this is evidently sufficient; other results make 
it appear likely that it is also sufficient for 3-dimensional spaces; the question 
seems to be open for more than 3-dimensional spaces in spite of the existence 
of Riemann spaces of non-constant curvature in which a given line element 


can be carried into an arbitrary other one by a motion of the Riemann space. 


THE JOHNS HOPKINS UNIVERSITY.


THE AXIS QUADRICS AT A POINT OF A SURFACE.* 


By M. L. MacQueen. 


1. Introduction. The purpose of this note is to define and study two quadrics, called axis quadrics, which are associated with each point of a given conjugate net on an analytic surface in ordinary projective space. In order to formulate a definition, let us consider a point x of a surface S referred to a conjugate net $N_x$. The osculating planes of the parametric curves at the point x intersect in the axis of the point x with respect to the net $N_x$. The osculating quadric along a generator at the point x of the ruled surface of axes constructed at the points of the u-curve through the point x is the limit of the quadric determined by the axis of the point x and the axes of two neighboring points $P_1$, $P_2$ on the u-curve as each of these points independently approaches the point x along the curve. The quadric thus defined will be called the axis quadric $Q_u$ at the point x. A second axis quadric $Q_v$ is defined similarly by using three consecutive axes of points on the v-curve through the point x. We shall now derive the equations of the axis quadrics and deduce some of their properties.


2. Equations. Let the surface S under consideration be an analytic non-ruled surface whose parametric vector equation, referred to conjugate parameters u, v, is

(1) $x = x(u, v).$

The four coordinates x of a point on the surface and the four coordinates y of the point which is the harmonic conjugate of the point x with respect to the foci of the axis of the point x satisfy a completely integrable system of partial differential equations of the form¹

$x_{uu} = px + \alpha x_u + Ly,$
(2) $x_{uv} = cx + ax_u + bx_v,$
$x_{vv} = qx + \delta x_v + Ny \qquad (LN \neq 0).$

The coefficients of these equations are functions of u, v and satisfy certain integrability conditions which need not be written here.


* Received June 21, 1940.
¹ E. P. Lane, Projective Differential Geometry of Curves and Surfaces, Chicago, 1932, p. 138.




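It is easy to verify that

(3) $y_u = fx - nx_u + sx_v + Ay, \qquad y_v = gx + tx_u + nx_v + By,$

where we have placed

(4)
$fN = c_v + ac + bq - c\delta - q_u, \qquad gL = c_u + bc + ap - c\alpha - p_v,$
$-nN = a_v + a^2 - a\delta - q, \qquad tL = a_u + ab + c - \alpha_v,$
$sN = b_v + ab + c - \delta_u, \qquad nL = b_u + b^2 - b\alpha - p,$
$A = b - (\log N)_u, \qquad B = a - (\log L)_v.$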


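The ray-points, or Laplace transformed points, ρ, σ of the point x are given by the formulas

(5) $\rho = x_u - bx, \qquad \sigma = x_v - ax.$

Some of the invariants of the parametric conjugate net are given by

(6)
$8\mathfrak{B}' = 4a - 2\delta + (\log r)_v, \qquad 8\mathfrak{C}' = 4b - 2\alpha - (\log r)_u,$
$\mathfrak{H} = sN, \qquad \mathfrak{K} = tL, \qquad r = N/L,$
$P = f + as - bn, \qquad Q = g + bt + an.$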


We shall suppose that $\mathfrak{H}\mathfrak{K} \neq 0$, so that the parametric curves are not plane curves.

Any point X near the point x and on the u-curve through the point x may be defined by the following power series in the increment Δu:

(7) $X = x + x_u\,\Delta u + \tfrac12 x_{uu}\,\Delta u^2 + \cdots.$

If the points x, $x_u$, $x_v$, y are used as the vertices of a local tetrahedron of reference, with unit point suitably chosen, then any point given by an expression of the form

(8) $x_1 x + x_2 x_u + x_3 x_v + x_4 y$

has local coordinates proportional to $x_1, \dots, x_4$. We find, by use of equations (2), that the local coordinates $x_1, \dots, x_4$ of the point X are given by the expansions

(9)
$x_1 = 1 + \tfrac12 p\,\Delta u^2 + \cdots,$
$x_2 = \Delta u + \tfrac12 \alpha\,\Delta u^2 + \cdots,$
$x_3 = \tfrac16 sL\,\Delta u^3 + \cdots,$
$x_4 = \tfrac12 L\,\Delta u^2 + \cdots.$

The osculating plane of the u-curve at the point X is determined by the points

(10) $X, \quad X_u, \quad X_{uu}.$

Similarly, the osculating plane of the v-curve at the point X is determined by the points $X$, $X_v$, $X_{vv}$, where

(11) $X_v = x_v + x_{uv}\,\Delta u + \tfrac12 x_{uuv}\,\Delta u^2 + \cdots, \qquad X_{vv} = x_{vv} + x_{uvv}\,\Delta u + \tfrac12 x_{uuvv}\,\Delta u^2 + \cdots.$

It is possible to express every derivative of x uniquely as a linear combination of x, $x_u$, $x_v$, y, so that the power series which represent the local coordinates of the points defined by equations (10), (11) are easily obtained and will not be written here. Making use of these results, we find that the local equations of the osculating planes of the u-curve and v-curve at the point X are respectively


(12) $\sum_{i=1}^{4} a_i x_i = 0, \qquad \sum_{i=1}^{4} b_i x_i = 0,$

where the coefficients $a_i$, $b_i$ are given by the expansions


a, = — +: 
a2 = 


a, = — sLAu — 4sL[2a + 2b + - 
b, = NAu+:::, 
b, = —N —2bNAu+:::, 


b, =aNAu + 4N[a, + 2ab — 2c4+ -, 
b, = —nNAu— 4N[P + an + 4bn + n(log nNV),yJAu? 


and l is defined by placing l = log r.

The osculating planes (12) of the parametric curves at the point X intersect in the axis of the point X. It is easy to verify that the axis of the point X pierces the face $x_1 = 0$ of the tetrahedron of reference in the point Y whose local coordinates $y_1, \dots, y_4$ are represented by the series


yi = 0, 


ys = sLNAu +- 4sLN [2a + 6b + (log 
LN(a+ 3b—1,)Au+:--, 


Any point Z on the axis of the point X can be defined by a linear combination of the form

(14) $Z = hX + kY$ (h, k scalars).


Y2 = — nLNAu — 4LN[f —as + 3an + 5bn — nly + n(log nL), +: °°, 




In order to calculate power series expansions for the local coordinates $z_1, \dots, z_4$ of the point Z, it is sufficient to multiply the series (9) by h and the series (13) by k and add corresponding series. Demanding that the equation of a general quadric be satisfied by the power series thus calculated, identically in h, k and identically in Δu as far as the terms of the second degree, we obtain the equation of the axis quadric $Q_u$ referred to the tetrahedron x, $x_u$, $x_v$, y, namely,


(15) (f—as—bn+ ny + uNy/N) — 
+- $(Su/s + Nu/N — «) + + = 0. 


The coefficients in equation (15) are not all invariants because the tetrahedron of reference is not a covariant tetrahedron. For the purpose of writing the equation of the quadric $Q_u$ referred to the covariant tetrahedron x, ρ, σ, y a simple computation shows that it is sufficient to replace $x_1$ in equation (15) by $x_1 - bx_2 - ax_3$. If we make this substitution and simplify the coefficients by means of equations (6), we arrive at the equation of the axis quadric $Q_u$ referred to the covariant tetrahedron x, ρ, σ, y, namely,


(16) [NWP —4(1D) — + + + = 0, 


where I is defined by placing
(17) [= (log 9)u + 1€” + Aly. 
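The change of tetrahedron used to pass from equation (15) to equation (16) can be checked with exact arithmetic. The sketch below assumes that the ray-points are ρ = x_u − bx and σ = x_v − ax, and it uses made-up numerical vectors in place of x, x_u, x_v, y; all names in the code are hypothetical. It confirms that a point with coordinates (X1, X2, X3, X4) in the tetrahedron x, ρ, σ, y has first coordinate X1 − bX2 − aX3 in the tetrahedron x, x_u, x_v, y, the other coordinates being unchanged.

```python
from fractions import Fraction as F

def combo(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i] of 4-component vectors."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(4))

# Hypothetical numerical stand-ins for x, x_u, x_v, y at one point of the surface.
x = (F(1), F(0), F(2), F(-1))
x_u = (F(0), F(1), F(1), F(3))
x_v = (F(2), F(-1), F(0), F(1))
y = (F(1), F(1), F(1), F(0))
a, b = F(3, 2), F(-2, 5)

# Assumed form of the ray-points (Laplace transforms of x).
rho = tuple(p - b * q for p, q in zip(x_u, x))
sigma = tuple(p - a * q for p, q in zip(x_v, x))

def to_old(new):
    """Coordinates in x, x_u, x_v, y of a point given in x, rho, sigma, y."""
    X1, X2, X3, X4 = new
    return (X1 - b * X2 - a * X3, X2, X3, X4)

new_coords = (F(5), F(-1), F(4), F(7))
point_new = combo(new_coords, (x, rho, sigma, y))
point_old = combo(to_old(new_coords), (x, x_u, x_v, y))
print(point_new == point_old)
```

The identity holds for any choice of the vectors, which is why substituting x1 − bx2 − ax3 for x1 is all that the change of reference requires.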


The equation of the quadric $Q_v$ at the point x can be written immediately by interchanging u and v and making the necessary symmetrical interchanges of the other symbols. For this result we find


(18) [LQ + x2? — 2a, 2. + + = 0, 


where J is defined by 


(19) (log + 48’ — 


3. Properties. Some simple properties of the quadrics $Q_u$ and $Q_v$ will now be deduced. In the first place, it is clear that the quadric $Q_u$ intersects the tangent plane, $x_4 = 0$, in the u-tangent, $x_3 = x_4 = 0$, and in the line represented by the equations
(20) 292, — — [NP — 4 = 0, 0. 


Likewise, the quadric $Q_v$ intersects the tangent plane in the v-tangent, $x_2 = x_4 = 0$, and in the line


(21) 2x, — [LO + 3D, = 0, 


The line (20) passes through the ray-point ρ in case I = 0, and the line (21) passes through the ray-point σ in case J = 0.

The intersections of the quadrics $Q_u$, $Q_v$ with the osculating planes of the parametric curves at the point x are of some interest. It is evident that the axis, $x_2 = x_3 = 0$, of the point x is a common generator of the two quadrics. The osculating plane, $x_3 = 0$, of the u-curve is tangent to the quadric $Q_u$ at the point x. Three vertices x, ρ, and y of the tetrahedron of reference x, ρ, σ, y lie on the quadric $Q_u$. The fourth vertex σ also lies on this quadric if, and
only if, WP Moreover, the osculating plane, = 0, of the 
v-curve at the point x contains two generators of the quadric Q,, namely, the 


axis and the generator whose equations are 


292, — [NP — (7D) ula, — 2nHa, = 0, Xo = 0. 


This line meets the line (20) in the point

(23) $(NP - \tfrac12 (r\mathfrak{D})_u,\ 0,\ 2\mathfrak{H},\ 0),$

and intersects the axis in the point whose coordinates are

(24) $(n, 0, 0, 1).$

Similarly, in the osculating plane $x_3 = 0$, we easily find that the generator of the quadric $Q_v$ that corresponds to (22) intersects the line (21) in the point

(25) $(LQ + \tfrac12 \mathfrak{D}_v,\ 2\mathfrak{K},\ 0,\ 0),$

and meets the axis in the point

(26) $(-n, 0, 0, 1).$


The points x and y are separated harmonically by the points (24), (26). Furthermore, the osculating plane of the v-curve at the point x is tangent to the quadric $Q_u$ at the point (24), and the osculating plane of the u-curve is tangent to the quadric $Q_v$ at the point (26).
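The harmonic separation can be verified by a cross-ratio computation on the axis. The sketch assumes that the points (24) and (26) have coordinates (n, 0, 0, 1) and (−n, 0, 0, 1), and that x and y are the vertices (1, 0, 0, 0) and (0, 0, 0, 1); the value of n and the helper names are arbitrary choices made for illustration.

```python
from fractions import Fraction

def det2(p, q):
    """2x2 determinant of two homogeneous parameter pairs."""
    return p[0] * q[1] - p[1] * q[0]

def cross_ratio(a, b, c, d):
    """Cross ratio (a, b; c, d) of four points of a projective line."""
    return Fraction(det2(a, c) * det2(b, d), det2(a, d) * det2(b, c))

def axis_param(point):
    """Homogeneous parameter (x1, x4) of a point on the axis x2 = x3 = 0."""
    x1, x2, x3, x4 = point
    if x2 != 0 or x3 != 0:
        raise ValueError("point is not on the axis")
    return (x1, x4)

n = Fraction(3, 7)               # any nonzero value of the coefficient n
X = (1, 0, 0, 0)
Y = (0, 0, 0, 1)
P24 = (n, 0, 0, 1)
P26 = (-n, 0, 0, 1)

cr = cross_ratio(*(axis_param(pt) for pt in (X, Y, P24, P26)))
print(cr)
```

A cross ratio of −1 is exactly the condition that the second pair separates the first pair harmonically; the computation requires n ≠ 0, since for n = 0 the two points coincide with y.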

The quadrics $Q_u$ and $Q_v$ intersect, besides in the axis, also in a twisted cubic which has the axis for a bisecant. This cubic meets the tangent plane in points (23), (25), and in the point of intersection of the lines (20), (21), since these points are common to both quadrics. Eliminating $x_1$ between equations (16), (18), we obtain the equation of the cubic cone projecting from the point x the curve of intersection of the quadrics $Q_u$, $Q_v$, namely,


The form of this equation makes it evident that the axis of the point x is a double line of this cone. Moreover, the equation of the nodal tangent planes of the cone along the axis is obtained by setting equal to zero the coefficient of $x_4$ in equation (27). Let us recall that the curvilinear differential equation of the axis curves of the net $N_x$ is given by

$s\,du^2 + 2n\,du\,dv - t\,dv^2 = 0.$

Thus the following theorem is proved:

The nodal tangent planes of the cone (27) along the axis of the point x are the planes through the axis that cut the tangent plane in the tangents of the axis curves.


Furthermore, the cone (27) cuts the tangent plane in the parametric tangents at the point x and in the line
S[LP + 49 RI — R[NP — 3 (7D). — GJ 23 = 0, Ly = 0, 


which joins the point x to the point of intersection of the lines defined by equations (20), (21).

Now let us project the curve of intersection of the quadrics $Q_u$, $Q_v$ from the ray-point σ. The result of eliminating $x_3$ between equations (16), (18) is found to be a composite quartic cone, one component being the osculating plane, $x_2 = 0$, of the v-curve at the point x. This projecting cone meets the osculating plane, $x_3 = 0$, of the u-curve in the axis and in a plane cubic curve which intersects the axis in the point x and in the points
28) yt st 


Since the points defined by the formulas (28) are the foci of the axis, we arrive at the following conclusion:

The quadrics $Q_u$, $Q_v$ intersect in the axis of the point x and in a cubic curve which intersects the axis in the foci of the axis.


In his investigation of the osculating linear complexes of the two curves 


$+\ 2\mathfrak{H}\mathfrak{K}\,(s x_2^2 + 2n x_2 x_3 - t x_3^2)\, x_4 = 0.$


of a conjugate net through a point of a surface, Lane has observed² that the point ρ corresponds to the plane
(29) NIv; + 262, = 0 
in the null system of the osculating linear complex of the u-curve. The plane (29) is found to be tangent to the quadric $Q_u$ at the point ρ. Therefore the tangent plane of the quadric $Q_u$ at the ray-point ρ is the plane which corresponds to the point ρ in the null system of the osculating linear complex of the u-curve.

Finally, the polar plane of the ray-point σ with respect to the quadric $Q_u$ has the equation


( 30) [ NP 4 (7D) u — = 0. 


This plane passes through the point y if, and only if, n = 0, so that the para- 


metric net is harmonic. 


SOUTHWESTERN COLLEGE. 


² E. P. Lane, "Contributions to the theory of conjugate nets," American Journal of Mathematics, vol. 49 (1927), p. 575.


A CRITERION FOR SOLVABILITY BY RADICALS.* ¹


By B. W. Brewer. 


Introduction. The Galois criterion for solvability by radicals is valid 
in the fields of characteristic zero, but not in those of prime characteristic. 
There is given in this paper a criterion which is valid in any field. This 
criterion emphasizes the importance of the primitive roots of unity and the 


cyclotomic polynomial in the theory of solvability by radicals. 


1. The cyclotomic polynomial in a field of prime characteristic. We 
establish in this section certain properties of the cyclotomic polynomial in a 
field of prime characteristic which are essential to the development. 

An absolutely algebraic field of prime characteristic is uniquely defined by its characteristic and absolute degree.² We shall denote by $A_{p,m}$ the absolutely algebraic field of prime characteristic p and absolute degree m. However, we shall denote by $GF[p^n]$ the finite field of $p^n$ elements when it seems necessary to emphasize its finite character.


LEMMA. An irreducible polynomial f(x) of degree n in the $A_{p,m}$ factors in the $A_{p,m'}$, m a divisor of m', into δ distinct irreducible factors each of degree n/δ, δ being the greatest common divisor of n and m'/m.


Proof. The coefficients of f(x) are elements of some $GF[p^{\kappa}] \subseteq A_{p,m}$. Since f(x) is irreducible in the $A_{p,m}$, f(x) is irreducible in the $GF[p^{\kappa}]$. From the fact that κ is a divisor of m, and δ a divisor of m'/m, it follows that $GF[p^{\kappa\delta}] \subseteq A_{p,m'}$. Since the lemma is known to be true for finite fields,³ f(x) factors in the $GF[p^{\kappa\delta}]$ into δ distinct irreducible factors $\phi_1(x), \dots, \phi_\delta(x)$, each of degree n/δ. These are the irreducible factors of f(x) in the $A_{p,m'}$. To show this, consider any factor of f(x) irreducible in the $A_{p,m'}$. The coefficients of this factor are elements of some $GF[p^{r}] \subseteq A_{p,m'}$. Let r = ab, where a is a divisor of m, and b a divisor of m'/m. Let $\nu_1\kappa$ be the least common multiple of a and κ, and $\nu_2\delta$ the least common multiple of b and δ. Since $\nu_1\kappa$ is a divisor of m, and $\nu_2\delta$ a divisor of m'/m, $GF[p^{r}] \subseteq GF[p^{\nu\kappa\delta}] \subseteq A_{p,m'}$, where $\nu = \nu_1\nu_2$. $\nu_1$ is relatively prime to n, since f(x) is irreducible in the $GF[p^{\nu_1\kappa}]$, a subfield of the $A_{p,m}$. $\nu_2$ is relatively prime to n/δ, since $\nu_2$ is a divisor of m'/m, and n/δ is relatively prime to m'/m. Hence ν is relatively prime to n/δ. Thus, since f(x) factors in the $GF[p^{\kappa\delta}]$ into the δ distinct irreducible factors $\phi_1(x), \dots, \phi_\delta(x)$, f(x) factors in the $GF[p^{\nu\kappa\delta}]$ into these same irreducible factors. But the irreducible factors of f(x) in the $GF[p^{\nu\kappa\delta}]$ are its irreducible factors in the $A_{p,m'}$. Hence f(x) factors in the $A_{p,m'}$ into δ distinct irreducible factors each of degree n/δ, and the lemma is proved.

* Received March 7, 1939; revised September 1, 1940.
¹ Most of the results obtained in this paper were included in a dissertation for the doctorate, University of Missouri (1938).
² The absolute degree is a certain G-adic number. See E. Steinitz, Algebraische Theorie der Körper, Berlin, Walter de Gruyter & Co., 1930, pp. 79-88.
³ L. E. Dickson, Linear Groups, Leipzig, B. G. Teubner, 1901, p. 33.

The set of all absolutely algebraic elements of a field K of prime charac- 
teristic constitutes a field which is absolutely algebraic. We shall call this 
field the maximal absolutely algebraic subfield of K. 

Throughout this paper, $g_n(x)$ will denote, for a given positive integer n and field K, that cyclotomic polynomial in K, whose roots are the φ(n) distinct primitive n-th roots of unity, and $T_n$ will denote the root field of $g_n(x)$ over K. If K is of prime characteristic p, this implies that $n \not\equiv 0 \pmod p$, since if $n \equiv 0 \pmod p$, primitive n-th roots of unity over K do not exist.


THEOREM 1. Let K be a field of prime characteristic p, and m be the absolute degree of the maximal absolutely algebraic subfield of K. Let $g_n(x)$ be the cyclotomic polynomial in K, whose roots are the φ(n) distinct primitive n-th roots of unity, $n \not\equiv 0 \pmod p$. Then, if d is the greatest common divisor of φ(n) and m, and e is the exponent to which $p^d$ belongs modulo n, $g_n(x)$ factors in K into φ(n)/e distinct, irreducible, separable factors each of degree e. Moreover, the Galois group of $g_n(x)$ relative to K is cyclic of order e.
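The arithmetic content of Theorem 1 is easy to compute when K is a finite field. The following sketch assumes K = GF[p^m] with m an ordinary positive integer (the theorem itself allows a G-adic absolute degree, which this code does not model); the function names are hypothetical.

```python
from math import gcd

def euler_phi(n):
    """Euler's totient: the number of primitive n-th roots of unity."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def multiplicative_order(a, n):
    """Exponent to which a belongs modulo n: least e >= 1 with a**e = 1 (mod n)."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    target = 1 % n
    e, x = 1, a % n
    while x != target:
        x = (x * a) % n
        e += 1
    return e

def cyclotomic_factor_pattern(p, m, n):
    """(number of factors, common degree) of g_n(x) over GF[p^m], following Theorem 1."""
    if n % p == 0:
        raise ValueError("primitive n-th roots of unity do not exist when p divides n")
    phi = euler_phi(n)
    d = gcd(phi, m)
    e = multiplicative_order(pow(p, d, n), n)
    return phi // e, e

# g_7(x) over GF[2]: two factors of degree 3.
print(cyclotomic_factor_pattern(2, 1, 7))
# g_7(x) over GF[8]: six linear factors, since 8 = 1 (mod 7).
print(cyclotomic_factor_pattern(2, 3, 7))
```

For a finite field the degree e obtained from p^d agrees with the exponent to which p^m itself belongs modulo n, which gives an independent check of the output.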


Proof. Since d is a divisor of m, the maximal absolutely algebraic subfield of K, namely the $A_{p,m}$, contains the $GF[p^d]$ as a subfield. Since theorem 1 is known to be true for finite fields,⁴ $g_n(x)$ factors in the $GF[p^d]$ into φ(n)/e distinct, irreducible, separable factors each of degree e. But e is relatively prime to m/d, and hence it follows from the above lemma that these are the irreducible factors of $g_n(x)$ in the $A_{p,m}$. Moreover, these are the irreducible factors of $g_n(x)$ in K, since the coefficients of the irreducible factors of $g_n(x)$ in K are symmetric functions of certain of the primitive n-th roots of unity, and hence elements of the $A_{p,m}$. Since the Galois group H of $g_n(x)$ relative to the $GF[p^d]$ is cyclic of order e, and the common degree of the irreducible factors of $g_n(x)$ in K is e, it follows from the well-known


⁴ H. Rauter, "Höhere Kreiskörper," Journal für die reine und angewandte Mathematik, vol. 159 (1928), pp. 220-227.


properties of the cyclotomic polynomial that H is the Galois group of $g_n(x)$ relative to K. Hence theorem 1 is proved.

We have immediately from theorem 1,


THEOREM 2. Let K be a field of prime characteristic p, and m be the absolute degree of the maximal absolutely algebraic subfield of K. Let $g_n(x)$ be the cyclotomic polynomial in K, whose roots are the φ(n) distinct primitive n-th roots of unity, $n \not\equiv 0 \pmod p$. Then a necessary and sufficient condition that $g_n(x)$ be irreducible in K is that p be a primitive root of n, and φ(n) be relatively prime to m.⁵
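The criterion of Theorem 2 can be compared with the factor degree given by Theorem 1. The sketch again assumes a finite field K = GF[p^m] with m a positive integer, and the function names are hypothetical; it tests the two conditions of Theorem 2 and checks that they hold exactly when the common degree e of Theorem 1 equals φ(n).

```python
from math import gcd

def euler_phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def multiplicative_order(a, n):
    """Least e >= 1 with a**e = 1 (mod n); a must be relatively prime to n."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    target = 1 % n
    e, x = 1, a % n
    while x != target:
        x = (x * a) % n
        e += 1
    return e

def irreducible_by_theorem_2(p, m, n):
    """p is a primitive root of n, and phi(n) is relatively prime to m."""
    phi = euler_phi(n)
    return multiplicative_order(p, n) == phi and gcd(phi, m) == 1

def irreducible_by_theorem_1(p, m, n):
    """g_n(x) is irreducible exactly when the common factor degree e equals phi(n)."""
    phi = euler_phi(n)
    d = gcd(phi, m)
    return multiplicative_order(pow(p, d, n), n) == phi

print(irreducible_by_theorem_2(2, 1, 5))   # g_5(x) over GF[2]
print(irreducible_by_theorem_2(2, 2, 5))   # g_5(x) over GF[4]
print(irreducible_by_theorem_2(2, 1, 7))   # 2 is not a primitive root of 7
```

Over GF[2] the polynomial g_5(x) = x^4 + x^3 + x^2 + x + 1 stays irreducible, while over GF[4] it splits into two quadratics because φ(5) = 4 and m = 2 share a factor.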


THEOREM 3. Let K be a field of prime characteristic p, and n be a composite positive integer not divisible by p. Then $K \subseteq T_d \subseteq T_n$, where d is a divisor of n. Moreover, if d is equal to the product of the distinct prime factors $q_1, q_2, \dots, q_r$ of n, the degree of $T_n$ over $T_d$ is a divisor of n.


Proof. That $K \subseteq T_d \subseteq T_n$ is well known, and hence we have only to show that the degree of $T_n$ over $T_d$ (denoted by $[T_n : T_d]$) is a divisor of n if $d = q_1 q_2 \cdots q_r$.

Let m be the absolute degree of the maximal absolutely algebraic subfield of K. Since φ(d) is a divisor of φ(n), it follows from theorem 1 that $[T_d : K]$ is the exponent e to which $P = p^{(\phi(n),\, m)}$ belongs modulo d. Hence $P^e \equiv 1 \pmod d$. Now if a and b are relatively prime positive integers, and $a \equiv 1 \pmod b$, then $a^{b^{s-1}} \equiv 1 \pmod{b^s}$ (s = 1, 2, ...). Hence it follows that $P^{en/d} \equiv 1 \pmod n$. Therefore, if $n = q_1^{k_1} q_2^{k_2} \cdots q_r^{k_r}$, the exponent to which P belongs modulo n is $e\, q_1^{l_1} q_2^{l_2} \cdots q_r^{l_r}$, where $0 \le l_i < k_i$ (i = 1, 2, ..., r). Thus from theorem 1, $[T_n : K] = e\, q_1^{l_1} q_2^{l_2} \cdots q_r^{l_r}$, and therefore

$[T_n : T_d] = q_1^{l_1} q_2^{l_2} \cdots q_r^{l_r},$ a divisor of n.
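Both the congruence step in this proof and the conclusion of Theorem 3 can be tested numerically. The sketch makes two assumptions that are not taken from the text: K is the prime field GF[p], so that the degrees of T_n and T_d over K are the exponents to which p belongs modulo n and modulo d, and the congruence step is read as a^(b^(s-1)) ≡ 1 (mod b^s). The function names are hypothetical.

```python
from math import gcd

def multiplicative_order(a, n):
    """Least e >= 1 with a**e = 1 (mod n); a must be relatively prime to n."""
    if gcd(a, n) != 1:
        raise ValueError("a must be relatively prime to n")
    target = 1 % n
    e, x = 1, a % n
    while x != target:
        x = (x * a) % n
        e += 1
    return e

def radical(n):
    """Product d of the distinct prime factors of n."""
    d, q, rest = 1, 2, n
    while q * q <= rest:
        if rest % q == 0:
            d *= q
            while rest % q == 0:
                rest //= q
        q += 1
    return d * rest if rest > 1 else d

def degree_ratio(p, n):
    """[T_n : T_d] over GF[p], with d the product of the distinct primes dividing n."""
    d = radical(n)
    return multiplicative_order(p, n) // multiplicative_order(p, d)

def lifts(a, b, s):
    """Check a**(b**(s-1)) = 1 (mod b**s) for a = 1 (mod b)."""
    return pow(a, b ** (s - 1), b ** s) == 1 % (b ** s)

print(degree_ratio(2, 9), degree_ratio(2, 45))
print(lifts(7, 3, 4), lifts(3, 2, 3))
```

For n = 9 and p = 2 the exponents are 6 modulo 9 and 2 modulo 3, so the ratio is 3, a divisor of 9, in agreement with the theorem.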


2. A criterion for solvability by radicals. An extension $\bar K$ of the field K is said to be pure over K, if and only if $\bar K = K(\alpha)$, α being a root of an irreducible binomial in K. We then have the

Definition. A polynomial f(x) in a field $K_0$ is said to be solvable by radicals over $K_0$, if and only if there exists a chain of fields

$K_0 \subset K_1 \subset \cdots \subset K_r \supseteq W_f,$

where $K_i$ is pure and of prime degree over $K_{i-1}$ (i = 1, 2, ..., r), and $W_f$ is the root field of f(x) over $K_0$.


⁵ Compare with F. Levi, "Zur Reduzibilität der Kreisteilungspolynome," Compositio Mathematica, vol. 2 (1935), pp. 303-304.


The fact that primitive n-th roots of unity exist and that $g_n(x)$ is solvable by radicals over a field of characteristic zero, for every positive integer n, is made use of in the proof of the Galois criterion. But primitive n-th roots of unity do not exist over a field K of prime characteristic p if $n \equiv 0 \pmod p$, and if $n \not\equiv 0 \pmod p$, $g_n(x)$ may not be solvable by radicals over K. The recognition of these facts leads to the following criterion for solvability by radicals.


THEOREM 4. Let f(x) be a polynomial in a field $K_0$, and n be the order of the Galois group G of f(x) relative to $K_0$. Then f(x) is solvable by radicals over $K_0$ if and only if

(I) G is solvable,

(II) primitive n-th roots of unity exist over $K_0$, and the cyclotomic polynomial $g_n(x)$ in $K_0$, whose roots are the φ(n) distinct primitive n-th roots of unity, is solvable by radicals over $K_0$.


Proof. We first make several remarks concerning notation. 


If N is a separable normal extension of finite degree over $K_0$, and K is any extension of $K_0$, then the root fields over K of all those polynomials in $K_0$ which have N as their common root field over $K_0$ are one and the same separable normal extension $\bar N$ of finite degree over K. This field $\bar N$, uniquely determined by $K_0$, N, and K, will be denoted by {N, K}. Now $N \subseteq \{N, K\}$, $K \subseteq \{N, K\}$, and we shall denote the intersection of N and K by [N, K].

$W_f$ will denote the root field of f(x) over $K_0$, and $M_f$ the maximal separable (necessarily normal) extension of $K_0$ contained in $W_f$. Then G is by definition the Galois group of $W_f$ relative to $K_0$, and G is isomorphic to the Galois group H of $M_f$ relative to $K_0$.⁶ This implies that the degree of $M_f$ over $K_0$ is n.

$T_m$ will denote, for a given positive integer m, the root field of $g_m(x)$ over $K_0$.

Now suppose (I) and (II) hold. We show that f(x) is solvable by radicals over $K_0$.

Since (II) holds, primitive n-th roots of unity exist over $K_0$, and there exists a chain of fields

$K_0 \subset K_1 \subset \cdots \subset K_r \supseteq T_n,$

where $K_i$ is pure and of prime degree over $K_{i-1}$ (i = 1, 2, ..., r) (If $T_n = K_0$, r = 0). Since (I) holds, H is solvable, and hence there exists a chain of fields

$K_r \subset K_{r+1} \subset \cdots \subset K_{r+s} \supseteq M_f,$

where $K_{r+i}$ is normal and of prime degree $q_i$ over $K_{r+i-1}$ (i = 1, 2, ..., s) (If $M_f \subseteq K_r$, s = 0). Since $K_r \supseteq T_n$, and $n \equiv 0 \pmod{q_i}$ (i = 1, 2, ..., s), it follows that $K_r \supseteq T_{q_i}$ (i = 1, 2, ..., s), and hence $K_{r+i}$ is pure over $K_{r+i-1}$ (i = 1, 2, ..., s). If $M_f = W_f$, then f(x) is solvable by radicals over $K_0$. If $M_f \neq W_f$, then $K_0$ is of prime characteristic p, and there exists a chain of fields

$M_f = \bar K_0 \subset \bar K_1 \subset \cdots \subset \bar K_v = W_f,$

where $\bar K_i = \bar K_{i-1}(\alpha_i)$, $\alpha_i$ being a root of an irreducible binomial $x^p - a_i$ in $\bar K_{i-1}$ (i = 1, 2, ..., v). Let $K'_0 = K_{r+s}$, $K'_1 = K'_0(\alpha_1)$, and in general $K'_i = K'_{i-1}(\alpha_i)$ (i = 1, 2, ..., v). Then either $K'_i = K'_{i-1}$, or $K'_i$ is pure and of prime degree p over $K'_{i-1}$ (i = 1, 2, ..., v). Therefore there exists a chain of fields

$K_0 \subset K_1 \subset \cdots \subset K_{r+s+t} \supseteq W_f,$

where $K_i$ is pure and of prime degree over $K_{i-1}$ (i = 1, 2, ..., r + s + t). Hence in this case f(x) is solvable by radicals over $K_0$.

⁶ See B. L. van der Waerden, Moderne Algebra, vol. 1, sec. ed., Berlin, Julius Springer, 1937, pp. 125-129.

Next suppose that f(x) is solvable by radicals over $K_0$. We show that (I) and (II) hold.

If n = 1, it is evident that (I) and (II) hold. Hence we shall suppose that n > 1, and let $p_1, p_2, \dots, p_\tau$ be the distinct prime factors of n.


By our assumption, there exists a chain of fields 


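where $K_i = K_{i-1}(\beta_i)$, $\beta_i$ being a root of an irreducible binomial $x^{q_i} - b_i$ of prime degree $q_i$ in $K_{i-1}$ ($i = 1, 2, \dots, s$). Let $q_{i_1}, q_{i_2}, \dots, q_{i_e}$ be those primes found among $q_1, q_2, \dots, q_s$ which are not equal to the characteristic of $K_0$. Then if $m = q_{i_1} q_{i_2} \cdots q_{i_e}$, primitive $m$-th roots of unity exist over $K_0$.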


As is well known, $T_m$ is metacyclic over $K_0$, and hence there exists a chain of fields


where Kj is normal and of prime degree over (t= Let 


Ry = K,, = K,(B,), and in general Kj = Ris(Bi) 
Then since [Ki (7 =1,2.° -,e), it follows from (1) that either 


= 


124 B. W. BREWER. 
kK; = or K; is pure and of prime degree over (i 


Hence there exists a chain of fields 


Ko=K,C K,C---C&,C K,,,C---C Ri, My, 


where K; is normal and of prime degree over Kj. (i=1,2,°--,¢+ u). 


It follows that there exists a chain of fields 


where $K_i$ is normal and of prime degree over $K_{i-1}$ ($i = 1, 2, \dots, v$).⁷ Hence $H$ is solvable, and likewise $G$. Thus (I) holds.

If $K_0$ is of characteristic zero, it is well known that (II) necessarily holds. Hence suppose that $K_0$ is of prime characteristic $p$. Since from (1), $K_s \supseteq W_f$,
there exists for each pj (t= 1,2,- a qj, = pi, such that [{M;, 44,4}, 
K;,] = K;,. Moreover, since M; is separable over Ko, [{My;, Kj,-1},.Kj,]_ is 
separable over Aj,-;, and being pure over A’j,_,, cannot be of degree p over Kj,-:. 
Hence $p_i \ne p$ ($i = 1, 2, \dots, r$), and thus primitive $n$-th roots of unity exist over
Ky. Since Kj, = Kj,-1(B;,) Kj,-.},and {M;, Kj,_,} is normal over K;,-1, 
has all of its roots in {M;, K;,-.}, a subfield of (1 = 1, 2,- -,7). 
This implies that and hence that Ky, 
where d= pre Ks} = Ks, then gn(x) is solvable by radicals 
over $K_0$, and the proof is complete. Hence suppose that $\{T_n, K_s\} \ne K_s$. Then it follows from theorem 3 that $[\{T_n, K_s\} : K_s]$ is a divisor of $n$. Thus,
since Ks} is cyclic over Ks, and T,,G (t= 1,2,° there exists 
a chain of fields 


where is pure and of prime degree over But 
$T_n \subseteq \{T_n, K_s\}$, and hence $g_n(x)$ is solvable by radicals over $K_0$, and (II) holds. This completes the proof of theorem 4.

If $K_0$ is of characteristic zero, theorem 4 reduces to the Galois criterion,
and this criterion is equivalent to a number-theoretic condition on the index 
series of G. 

If $K_0$ is of prime characteristic, theorem 4 is equivalent to a similar
number theoretic condition on the index series of G. To show this, it is only 
necessary to prove the following two theorems concerning the cyclotomic 
polynomial. 


7 See H. Hasse, Höhere Algebra II, sec. ed., Walter de Gruyter & Co., 1937, theorem 119, p. 120.




THEOREM 5. A necessary and sufficient condition that $g_n(x)$ be solvable by radicals over a field $K_0$ of prime characteristic $p$, $n$ being composite and not divisible by $p$, is that $g_d(x)$ be solvable by radicals over $K_0$ for every prime divisor $d$ of $n$.


Proof. The necessity of the condition follows at once from the definition of solvability by radicals and theorem 3.


To show the sufficiency of the condition, let $p_1, p_2, \dots, p_r$ be the distinct prime factors of $n$, and suppose that $g_{p_i}(x)$ is solvable by radicals over $K_0$ ($i = 1, 2, \dots, r$). Then it follows that there exists a chain of fields

where $K_i$ is pure and of prime degree over $K_{i-1}$, and $T_{p_i}$ is the root field of $g_{p_i}(x)$ over $K_0$ ($i = 1, 2, \dots, r$). As in the proof of theorem 4, this implies that $g_n(x)$ is solvable by radicals over $K_0$, and hence the condition is sufficient.

Let $K$ be a field of prime characteristic $p$, and $m$ be the absolute degree of the maximal absolutely algebraic subfield of $K$. Then by means of $p$ and $m$, we can define a class $C_{p,m}$ of primes in the following recursive manner.

Let $p_1, p_2, \dots$ be the set of all primes (including 1), where $p_i < p_j$ if $i < j$, and suppose that $p = p_k$.

1. For $i < k$, $p_i$ belongs to $C_{p,m}$.
2. $p$ does not belong to $C_{p,m}$.
3. For $i > k$, let $e_i$ be the exponent to which $p^m$ belongs modulo $p_i$, and $p_{i_1}, p_{i_2}, \dots, p_{i_t}$ be the distinct prime factors of $e_i$. Then $i_j < i$ ($j = 1, 2, \dots, t$), and if $p_{i_j}$ belongs to $C_{p,m}$ ($j = 1, 2, \dots, t$), $p_i$ belongs to $C_{p,m}$. Otherwise $p_i$ does not belong to $C_{p,m}$.
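The recursive definition above can be traced mechanically. The following is only a sketch under stated assumptions: $m$ is taken to be a finite positive integer, the quantity whose exponent modulo $p_i$ is used in rule 3 is read as $p^m$ (the printed symbol is illegible in this copy), and an exponent equal to 1 is treated as having no prime factors, which has the same effect as counting 1 among the primes. The function names are illustrative and do not come from the paper.

```python
from math import gcd


def multiplicative_order(a: int, d: int) -> int:
    """Smallest e >= 1 with a**e congruent to 1 modulo d (needs gcd(a, d) == 1)."""
    if gcd(a, d) != 1:
        raise ValueError("a must be prime to d")
    e, power = 1, a % d
    while power != 1:
        power = (power * a) % d
        e += 1
    return e


def prime_factors(n: int) -> set:
    """Distinct prime factors of n by trial division (empty set for n == 1)."""
    factors, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            factors.add(q)
            n //= q
        q += 1
    if n > 1:
        factors.add(n)
    return factors


def in_class(d: int, p: int, m: int) -> bool:
    """Membership of the prime d in the class C_{p,m} (m finite; assumed reading)."""
    if d == p:          # rule 2: p itself is excluded
        return False
    if d < p:           # rule 1: every prime below p belongs
        return True
    # Rule 3: exponent of p**m modulo d (ASSUMPTION: the garbled base is p**m).
    e = multiplicative_order(pow(p, m, d), d)
    # e divides d - 1, so each prime factor is smaller than d and recursion ends.
    return all(in_class(q, p, m) for q in prime_factors(e))


def cyclotomic_solvable(n: int, p: int, m: int) -> bool:
    """Criterion of theorems 5 and 6 for g_n(x), n prime to p, m finite."""
    if n % p == 0:
        raise ValueError("n must not be divisible by p")
    return all(in_class(d, p, m) for d in prime_factors(n))
```

For instance, with $p = 2$ and $m = 2$ the exponent of $4$ modulo $7$ is $3$, and $3$ belongs to the class because the exponent of $4$ modulo $3$ is $1$; hence `in_class(7, 2, 2)` holds, while `in_class(5, 2, 2)` fails because the exponent of $4$ modulo $5$ is $2 = p$.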
Then from theorems 1, 4, and 5, we have 


THEOREM 6. Let $K$ be a field of prime characteristic $p$, and $m$ be the absolute degree of the maximal absolutely algebraic subfield of $K$. Then a necessary and sufficient condition that $g_d(x)$ be solvable by radicals over $K$, $d$ being a prime distinct from $p$, is that $d$ belong to the class $C_{p,m}$.


It is now evident from theorems 5 and 6 that theorem 4 is equivalent to the following theorem if $K_0$ is of prime characteristic.


THEOREM 7. Let $K$ be a field of prime characteristic $p$, and $m$ be the absolute degree of the maximal absolutely algebraic subfield of $K$. Then a necessary and sufficient condition that a polynomial $f(x)$ in $K$ be solvable by radicals over $K$ is that the index series of the Galois group of $f(x)$ relative to $K$ consist of prime numbers belonging to the class $C_{p,m}$.


We may now give a characterization of those fields of prime characteristic 
over which every cyclotomic polynomial is solvable by radicals. 


THEOREM 8. Let $K$ be a field of prime characteristic $p$, and $m$ be the absolute degree of the maximal absolutely algebraic subfield of $K$. Then a necessary and sufficient condition that the cyclotomic polynomial $g_n(x)$ in $K$, whose roots are the $\phi(n)$ distinct primitive $n$-th roots of unity, be solvable by radicals over $K$, for every $n \not\equiv 0 \pmod{p}$, is that $m$ be divisible by $p^\infty$.


Proof. To show the necessity of the condition, suppose that $g_n(x)$ is solvable by radicals over $K$ for every $n \not\equiv 0 \pmod{p}$. Moreover, suppose that the exponent $d$ of $p$ in $m$ is finite. We show that this last assumption leads to
a contradiction. If k = p****—1, then k 40 (mod p), and the exponent to 
which belongs modulo is p. Hence from theorems 1 and 7, 9x (x) 
is not solvable by radicals over K. Thus we have a contradiction, and the 
condition is necessary. 

To show the sufficiency of the condition, we have only to note, in view of theorems 5 and 6, that if $m$ is divisible by $p^\infty$, every prime distinct from $p$ belongs to the class $C_{p,m}$.


UNIVERSITY OF MISSOURI.




ACCESSIBILITY AND SEPARATION BY SIMPLE CLOSED 
CURVES.* ¹


By EBON E. BETZ.


The accessibility theorem presented here was obtained in an attempt to determine whether or not a compact Peano space² M need be a simple closed surface if M is separated by each of its simple closed curves but by no pair of its points. From this accessibility theorem it follows immediately that in such a space M, if the number of complementary domains of each simple closed curve is finite, then every point of the boundary of a complementary domain of a simple closed curve is regularly accessible³ from that domain. It follows readily that the space M described above is a simple closed surface if the complement of every simple closed curve of M consists of exactly two components. This result was obtained previously by Zippin⁴ by a different method. Using this accessibility theorem, Dr. D. W. Hall has extended Zippin's result, proving that the above space M is a simple closed surface if there exists a number N such that no simple closed curve of M has more than N complementary domains.
I am deeply indebted to Professor J. R. Kline for his guidance and aid 


* Received June 10, 1940. 

1 Presented to the American Mathematical Society, April, 1939. 

2 A Peano space is a non-degenerate, connected, locally connected, locally compact, metric space.

3 A boundary point P of a point set R is said to be accessible from R if there exists an arc having P as one endpoint and contained in R except for the point P. P is regularly accessible from R if for every positive number $\epsilon$ there exists a positive number $\delta$ such that every point of $R \cdot S(P,\delta)$ can be joined to P by an arc of $S(P,\epsilon)$ lying in R except for P.

Some of the symbols used are listed here for reference. 

$\rho(X, Y)$ denotes the distance from X to Y, whether X and Y are points or point sets.

$S(P,\epsilon)$ denotes the set of points $x$ such that $\rho(P,x) < \epsilon$.

$F(P,\epsilon)$ denotes the set of points $x$ such that $\rho(P,x) = \epsilon$.

$\bar S(P,\epsilon)$ denotes the closure of $S(P,\epsilon)$.

"$\supset$" means "contains"; "$\subset$" means "is contained in."

$F(D)$ denotes the set of boundary points of the point set D; that is, $F(D) = \bar D \cdot \overline{(M - D)}$, where M denotes the space of which D is a subset.

If $t$ is an arc, then <t> denotes the same arc with its endpoints deleted. If $ab$ is an arc from $a$ to $b$, then <ab and ab> denote $ab - a$ and $ab - b$ respectively.

4 Leo Zippin, "On continuous curves and the Jordan curve theorem," American Journal of Mathematics, vol. 52 (1930), pp. 331-350. See Theorem 4, p. 348.




in the work on this problem, which he suggested. Dr. Dick Wick Hall has also aided greatly.


LEMMA 1. If the limit point P of the arcwise connected set R is not regularly accessible from R, then there exists a positive number $\epsilon$ such that to every pair of positive numbers $\delta$ and $\eta$ such that $\delta \le \eta < \epsilon$ there corresponds a sequence $\{\alpha_i\}$ of arcs of R with the following properties: 1) for each $i$, $\alpha_i$ is an arc of $\bar S(P,\delta)$ from a point of $F(P,\delta)$ to a point $P_i$; 2) $\{P_i\}$ converges to P; and 3) no two different elements of $\{\alpha_i\}$ can be joined by an arc of $R \cdot S(P,\eta)$.


By a theorem of G. T. Whyburn,⁵ we can find a positive number $\epsilon$ and a sequence $\{P_i\}$ of points of R having P as a sequential limit point and such that no two of these points can be joined by an arc of R which is of diameter $< 2\epsilon$. Let $\delta$ and $\eta$ be any two positive numbers such that $\delta \le \eta < \epsilon$. Without loss of generality we may suppose that all of the points of $\{P_i\}$ are contained in $S(P,\delta)$. Since R is arcwise connected, we can find an arc of R from $P_1$ to $P_2$, which must be of diameter $\ge 2\epsilon$, and hence cannot be contained in $S(P,\epsilon)$. It follows that R has a point Q not contained in $S(P,\epsilon)$. For each $i$, construct an arc $P_iQ$ of R from $P_i$ to Q, and on that arc let $Q_i$ be the first point from $P_i$ belonging to $F(P,\delta)$. Let $\alpha_i$ denote the subarc of $P_iQ$ from $P_i$ to $Q_i$. We show that the sequence $\{\alpha_i\}$ fulfills the requirements of the lemma. That conditions 1) and 2) are fulfilled is evident. If 3) were not fulfilled, there would exist an arc $\beta$ of $R \cdot S(P,\eta)$ joining two different arcs of $\{\alpha_i\}$, say $\alpha_m$ and $\alpha_n$. Then in $\beta + \alpha_m + \alpha_n$ we could construct an arc of R of diameter $< 2\epsilon$ joining two distinct points $P_m$ and $P_n$ of $\{P_i\}$, contrary to the selection of $\{P_i\}$.


LEMMA 2. If P is an endpoint of an arc $t$ of a simple closed curve J of a connected space M with the property that the complement of every simple closed curve of M consists of not more than a finite number of components, then there exists a positive number $\delta$ such that, if <$\alpha$> is an arc of $M \cdot S(P,\delta) - J$ both of whose endpoints are in <t>, then no component of $M - (J + \alpha)$ has its boundary contained entirely in $\alpha$.


Supposing the contrary, we let $\delta_1$ be any positive number and let <$\alpha_1$> be an arc of $M \cdot S(P,\delta_1) - J$ whose endpoints are contained in <t> such that there is a component $R_1$ of $M - (J + \alpha_1)$ whose boundary is contained entirely in $\alpha_1$. We let $\delta_2$ be a positive number such that $\delta_2 < \tfrac{1}{2}\delta_1$, $\delta_2 < \rho(P, \alpha_1)$,


5 "Concerning Menger regular curves," Fundamenta Mathematicae, vol. 12 (1928), pp. 264-294; Theorem 3, p. 272.




and $t \cdot S(P,\delta_2)$ is contained in a connected subset of $t - \alpha_1$. Select an arc <$\alpha_2$> of $M \cdot S(P,\delta_2) - J$ whose endpoints are in <t> such that there is a component $R_2$ of $M - (J + \alpha_2)$ whose boundary is contained entirely in $\alpha_2$. Continuing inductively, we define a sequence $\{\delta_i\}$ of positive numbers, a sequence $\{\alpha_i\}$ of arcs whose endpoints are in <t>, and a sequence $\{R_i\}$ of sets such that, for each $i$, 1) $\delta_i < \tfrac{1}{2}\delta_{i-1}$; 2) $\delta_i < \rho(P, \alpha_{i-1})$; 3) $t \cdot S(P,\delta_i)$ is contained in a connected subset of $t - \alpha_{i-1}$; 4) <$\alpha_i$> $\subset M \cdot S(P,\delta_i) - J$; and 5) $R_i$ is a component of $M - (J + \alpha_i)$ such that $F(R_i) \subset \alpha_i$. From 2) and 4) it follows that 6) $\alpha_i \cdot \alpha_j = 0$ for $i \ne j$. From 1) and 4) it follows that 7) $\{\alpha_i\}$ converges to P. If for each $i$ we denote by $b_i$ the set of points of $t$ between the endpoints of $\alpha_i$, it follows from 2), 3), and 4) that 8) $b_i \cdot b_j = 0$ for $i \ne j$. From 6), 7), and 8) it follows that $t + \sum \alpha_i - \sum b_i$ is an arc $t'$. From 5) and 6) it follows that the sets $R_i$ are all distinct. They are all components of the complement of the simple closed curve $t' + (J - t)$. This contradiction of our hypothesis that every simple closed curve of M has at most a finite number of complementary domains completes the proof of the lemma.


LEMMA 3. Let M be a Peano space with the property that the complement on M of every simple closed curve of M consists of at least two and at most a finite number of components. For no boundary point P of a complementary domain D of a simple closed curve J of M does there exist a positive number $\epsilon$ such that to every pair of positive numbers $\delta$ and $\eta$ such that $\delta \le \eta < \epsilon$ there corresponds a sequence $\{\alpha_i\}$ of arcs of D with the following properties: 1) for each $i$, $\alpha_i$ is an arc of $\bar S(P,\delta)$ from a point of $F(P,\delta)$ to a point $P_i$; 2) $\{P_i\}$ converges to P; and 3) no two different elements of $\{\alpha_i\}$ can be joined by an arc of $D \cdot S(P,\eta)$.


Suppose the contrary, that there is a boundary point P of a complementary domain D of a simple closed curve J of M and a positive number $\epsilon$ such that to every pair of positive numbers $\delta$ and $\eta$ such that $\delta \le \eta < \epsilon$ there corresponds a sequence $\{\alpha_i\}$ of arcs of D with the properties listed above.

Let $P'$ be a particular point of J distinct from P. Let $\eta$ be a positive number such that $\eta < \epsilon$ and $\eta < \rho(P, P')$.


Let $D_1, D_2, \dots, D_k$ denote the components of $M - J$ distinct from D, where $D_i$ has a boundary point distinct from P for $i = 1, 2, \dots, k'$, but not for $i = k'+1, \dots, k$. For $i = 1, 2, \dots, k'$ let $Q_i$ be a particular boundary point of $D_i$ distinct from P.

If P is not the only boundary point of D (we shall prove that it is not later), then, since P is a non-cut point of J and D is connected, it follows that P is a non-cut point of $J + D$. Furthermore, $J + D$ is a Peano space. Hence we can find a positive number $\delta'$ such that $J + D - S(P,\eta)$ is contained in a connected subset of $J + D - S(P,\delta')$.⁶

By means of lemma 2 we determine a positive number $\delta''$ such that, for $t$ either arc of J from P to $P'$, if <$\alpha$> is an arc of $M \cdot S(P,\delta'') - J$ whose endpoints are in <t>, then no component of $M - (J + \alpha)$ has its boundary contained entirely in $\alpha$.

Let $\delta'''$ be a positive number such that $M \cdot \bar S(P,\delta''')$ is compact.


Let $\delta$ be any positive number less than each of the numbers $\eta$, $\delta'$, $\delta''$, $\delta'''$, and $\rho(P, \sum_{i=1}^{k'} Q_i)$. Should the number $\delta'$ as defined above fail to exist, we have here simply to omit the symbol $\delta'$.

With this selection of $\eta$ and $\delta$, we determine a sequence $\{\alpha_i\}$ of arcs of D such that 1) for each $i$, $\alpha_i$ is an arc of $\bar S(P,\delta)$ from a point of $F(P,\delta)$ to a point $P_i$; 2) $\{P_i\}$ converges to P; and 3) no two different elements of $\{\alpha_i\}$ can be joined by an arc of $D \cdot S(P,\eta)$. The sequence $\{\alpha_i\}$ will contain a convergent subsequence, and this convergent subsequence has all of the properties we have listed for $\{\alpha_i\}$; hence without loss of generality we may suppose $\{\alpha_i\}$ to converge. Denote by H the sequential limit of $\{\alpha_i\}$. Since $\delta < \delta'''$ and, for each $i$, $P_i \subset \alpha_i \subset \bar S(P,\delta)$, $\alpha_i \cdot F(P,\delta) \ne 0$, and $\alpha_i$ is connected, it follows that $P \subset H \subset \bar S(P,\delta)$, $H \cdot F(P,\delta) \ne 0$, and H is connected.

We show that $H \subset J$. If not, then there is a point $x$ of H in $D \cdot \bar S(P,\delta)$. Since M is locally connected, we can find a number $\xi > 0$ such that every point of $M \cdot S(x,\xi)$ can be joined to $x$ by an arc of $M \cdot S[x, \rho(x, J + F(P,\eta))]$. Since $x$ is in H, the limit set of $\{\alpha_i\}$, we can find two distinct integers $m$ and $n$ such that $\alpha_m$ and $\alpha_n$ meet $S(x,\xi)$; let $y$ and $z$ be points of $\alpha_m \cdot S(x,\xi)$ and $\alpha_n \cdot S(x,\xi)$ respectively. Let $yx$ and $zx$ be arcs of $M \cdot S[x, \rho(x, J + F(P,\eta))]$ joining $y$ and $z$ respectively to $x$. In $yx + zx$ we can find an arc $f$ joining $\alpha_m$ and $\alpha_n$. Since $f \subset M \cdot S[x, \rho(x,J)]$, we have $f \subset D$. Since $x \subset S(P,\eta)$ and $yx + zx \subset S[x, \rho(x, F(P,\eta))]$, we have $f \subset yx + zx \subset S(P,\eta)$. Thus $f$ is an arc of $D \cdot S(P,\eta)$ joining the two different elements $\alpha_m$ and $\alpha_n$ of $\{\alpha_i\}$. This contradicts our choice of $\{\alpha_i\}$. We conclude that $H \subset J$.

Since every point of H is a limit point of $\sum_{i=1}^{\infty} \alpha_i \subset D$ and $H \subset J$, it follows that every point of H is a boundary point of D. We have shown that $H \cdot F(P,\delta) \ne 0$, and thus P is not the only boundary point of D. This establishes the existence of the number $\delta'$ as defined above.


6 R. L. Wilder, "The sphere in topology," American Mathematical Society Semicentennial Publications, vol. II: Semicentennial Addresses, pp. 136-184; Corollary to Theorem 27, p. 148.




The set H, being a connected subset of J and containing P and a point of $F(P,\delta)$, contains an arc $h$ from P to a point of $F(P,\delta)$ such that <h> $\subset S(P,\delta)$.

If $e$ is any open subarc of $h$, there exists an integer $n$ such that, for $i \ge n$, $\alpha_i$ can be joined to $e$ by an arc of $S(P,\delta)$ lying in D except for its endpoint in $e$. To obtain such a number $n$ we proceed in the following way. Let $x$ be any point of $e$. Since M is locally connected, we can find a number $\xi > 0$ such that every point of $M \cdot S(x,\xi)$ can be joined to $x$ by an arc of $M \cdot S[x, \rho(x, M \cdot F(P,\delta) + J - e)]$. Since $x$ is a point of the limit set of $\{\alpha_i\}$, we can find an integer $n$ such that $\rho(x, \alpha_i) < \xi$ for $i \ge n$. We show that this $n$ suffices. For $i \ge n$ we can find an arc $y_ix$ of $M \cdot S[x, \rho(x, M \cdot F(P,\delta) + J - e)]$ from a point $y_i$ of $\alpha_i \cdot S(x,\xi)$ to $x$. The subarc $u_iv_i$ of this arc $y_ix$ from the last point $u_i$ which belongs to $\alpha_i$ to the first point $v_i$ after $u_i$ which belongs to $e$ satisfies the requirements; for $u_iv_i \cdot J = v_i$, since $u_iv_i \subset S[x, \rho(x, J - e)]$, and hence $u_iv_i$> $\subset D$ since $u_i \subset \alpha_i \subset D$; and also $u_iv_i \subset S(P,\delta)$ since $x \subset$ <h> $\subset S(P,\delta)$ and $u_iv_i \subset S[x, \rho(x, M \cdot F(P,\delta))]$.

Let $e_1$, $e_2$, $e_3$, and $e_4$ be mutually exclusive open subarcs of $h$, numbered in order from P. By the previous paragraph, for $j = 1, 2, 3, 4$, we can find an integer $n_j$ such that, for $i \ge n_j$, there exists an arc $u_{ij}v_{ij}$ joining $\alpha_i$ to $e_j$ and lying in $D \cdot S(P,\delta)$ except for $v_{ij}$. Let $i_0$ and $i_1$ be two different integers such that $i_0 \ge n_1 + n_3$ and $i_1 \ge n_2 + n_4$. In $\alpha_{i_1} + u_{i_1 2}v_{i_1 2} + u_{i_1 4}v_{i_1 4}$ we construct an arc $\theta$ spanning $h$. The endpoints of $\theta$ are $v_{i_1 2}$ and $v_{i_1 4}$; these we rename $w$ and $z$. We have $w \subset e_2$ and $z \subset e_4$.

For any two points $a$ and $b$ on $h$ the symbol $ab$ will denote the arc of $h$ from $a$ to $b$.

We observe that $h$ is a subarc of one of the arcs of J from P to $P'$. Hence, since <$\theta$> $\subset D \cdot S(P,\delta)$ while $\delta < \delta''$, it follows that no component of $D - \theta$ has its boundary contained entirely in $\theta$. Thus every component of $D - \theta$ has a boundary point in $J - (w + z)$. Let $G_1, G_2, \dots, G_r$ be the components of $D - \theta$ having their boundaries contained in $\theta + wz$. That there can be only a finite number of these follows from the fact that each is a component of the complement of the simple closed curve $\theta + wz$. For $i = 1, 2, \dots, r$, let $g_i$ be a particular boundary point of $G_i$ in <wz>.
Following closely the process used in obtaining $\theta$, we determine an integer $i_2$ different from $i_0$ and $i_1$ such that $\alpha_{i_2}$ can be joined to each of the open arcs $e_2 \cdot \prod_{i=1}^{r}$ <$wg_i$> and $e_4 \cdot \prod_{i=1}^{r}$ <$g_iz$> by an arc lying in $D \cdot S(P,\delta)$ except for one endpoint. In case $r = 0$, we have simply to omit the symbols $\prod_{i=1}^{r}$ <$wg_i$> and $\prod_{i=1}^{r}$ <$g_iz$>. From the sum of $\alpha_{i_2}$ and two such joining arcs we construct an arc $\lambda$ spanning $h$ and having its endpoints $x$ and $y$ in $e_2 \cdot \prod_{i=1}^{r}$ <$wg_i$> and $e_4 \cdot \prod_{i=1}^{r}$ <$g_iz$> respectively. Since each component of $D - \lambda$ whose boundary is contained in $\lambda + (J - xy)$ is a component of the complement of this simple closed curve, it follows that the number of these is finite: let them be $E_1, E_2, \dots, E_s$. Since <$\lambda$> $\subset D \cdot S(P,\delta)$ and $\delta < \delta''$, it follows that every other component of $D - \lambda$ must have a boundary point in <xy>.

It follows immediately from the construction process that $\theta$ and $\lambda$ can be joined to $\alpha_{i_1}$ and $\alpha_{i_2}$ respectively by arcs of $D \cdot S(P,\delta)$. Also, <$\theta$> + <$\lambda$> $\subset D \cdot S(P,\delta)$. The endpoints of $\theta$ and $\lambda$ are all distinct. Hence if $\theta$ and $\lambda$ were to intersect, they would intersect in an interior point of both, and in their sum together with arcs of $D \cdot S(P,\delta)$ joining them to $\alpha_{i_1}$ and $\alpha_{i_2}$ we could construct an arc of $D \cdot S(P,\delta) \subset D \cdot S(P,\eta)$ joining $\alpha_{i_1}$ and $\alpha_{i_2}$, contrary to the construction of $\{\alpha_i\}$. We conclude that $\theta \cdot \lambda = 0$.

Construct the simple closed curve J’ = A+ we-+ yz. Then J-J’ 
we + yz Ce,+ ey. Let K denote the component of WM which contains 
Then K must contain u;,;vi,; for 7 = 1 and 3; for if any of these ares met J’, 
it would have to meet <6 -+ <A), and hence we could construct an are of 
D-S(P,8) joining «;, to either a;, or #,. Hence A must contain e; + és, and 
hence, in turn, J —J’. In particular, K 0 P+ P+ $ Q;. It follows that 

i=l 


D;; for 1+=1,2,:--,k’, Qi is a limit point of contained in Kk, 
i=1 

and, 
No component of I —.J’ different from K can have boundary points in 

both <@> and <A>. To show this, we suppose Z to be a component of M—-/’ 

different from K having boundary points in both <@> and <A>. Then we 

can construct an are p» from a point of <@> to a point of <A> lying in L 


we have LC D, and 


P is a limit point of D; contained in Kk. 


except for its endpoints. Since K + (J —J’), 

thus» C D. Then inp a;, + 2;, and the joining ares used in constructing 6 

Hence, by the con- 


Let Q be a point 


and A we can construct an are of DP joining 2;, and @,. 

struction of {a;}, this are cannot be contained in S(P, 7). 
of »» [D—S(P,7)]. By definition of 8’, the two points P’ and Q, belonging 
to J + D—S(P,7), are contained in a connected subset of J + D — S(P, 8’) 
CJ+ D—S8(P,8) (J+D—J’CM—J’. Since PCK and QCpz 
C L, this contradicts our supposition that K and L were different components 


of J’. 


We now consider the set D—J’. Certain components of this set may 




have boundary points in both <> and <A>, and hence must belong to K, 
by the preceding paragraph; every other component of D—J’ has boundary 
points in 6+ J only or in A+ J only, and hence is a component of D— 64 


not meeting A or a component of D — A not meeting 6. We have > gi C <2zy> 
i=1 


‘ 
CK, and thus } G;C K. Every other component of D— 6, as we have 
i=1 


already seen, has a boundary point in J—wzCK, and hence those not 
meeting A are contained in kK. All components of D—A other than F, F2, 
- + +, H, have boundary points in ¢ zy > C K, and hence those which do not meet 
are contained in K. We may suppose the sets -, Hs to be numbered 
so that, for s’ properly chosen, F(E;) - (J — wz) = 0 for i= 1, 2,- - -,s’ and 
(J — wz) for + Then 

i=8'+1 


a’ 


We have shown that all of 1 —.J’ is contained in K except > Ei. Since, 


i=1 


3’ 
by hypothesis, J’ must separate M, it follows that } £; 0, and thus s’ = 1. 


i=1 
Let $E = E_1$. Since $F(E) \subset \lambda + (J - xy)$ while $F(E) \cdot (J - wz) = 0$, we have $F(E) \subset \lambda + wx + yz$.

Thus for $\delta$ any positive number sufficiently small, we have shown how to determine points $w$, $x$, $y$, and $z$ distinct from each other and from P on J in the order $Pwxyz$ and such that the arc $Pwxyz$ of J is contained in $S(P,\delta)$; an arc $\lambda$ from $x$ to $y$ such that <$\lambda$> $\subset D \cdot S(P,\delta)$; and a component $E$ of $D - \lambda$ such that $F(E) \subset \lambda + wx + yz$.

Let 8, be a particular positive number satisfying the conditions on 6. 
Determine 7%. Yi, 21, Ar, and such that w,, 7%, y;, and are points 
on J distinct from each other and from P in the order Pw,x,y,2, and such 
that the are Pw,x,y,2, of J is contained in S(P,8,) ; A, is an are from 2, to y; 
such that <A, > C D- S(P,8,); and £, is a component of D—., such that 
Cay + wit, + Since p(P, Ai + > 0, we can select a 
positive number such that & < p(P,A: + and 8 < $8. Corre- 
sponding to we obtain ws, 22, A2, and Then we select a positive 
number 8; such that 8 < p(P,ArA2 + weteyez2) and 8 < 48. Continuing 
inductively, we determine sequences {wi}, {xi}, {yi}, {zi}, {Ai}, {Li}, and 
(8;} such that, for each 71, 1) wi, vi, yi, and 2 are points on J distinct 
from each other and from P in the order Pwiajyizi; 2) the are Pwixiyizi 
of J is contained in S(P.8i); 3) Ati is an are from 2; to y; such that 
<rAi> C D- S(P, bi); 4) Hi is a component of D — A; such that 
CA H+ witi + yizis «and 6) 8 < p(P,&-, 




+ From 3) and 6) it follows that 7) =0 if 
From 3) and 5) it follows that 8) {Ai} converges to P. From 2) and 6) 
it follows that 9) =0 for i 7. It readily follows that 
oO 
+S Ai is a simple closed curve Jy. We have (A; 
i=1 i=1 i=1 
$+\ w_ix_i + y_iz_i)$. Hence, from 4), $E_i$ is a complementary domain of $J_0$ for every $i$. It follows from 7) and 9) that $E_i$ and $E_j$ are distinct for $i \ne j$. Thus $J_0$ has infinitely many complementary domains, contrary to our hypothesis. This completes the proof of Lemma 3.


THEOREM 1. ACCESSIBILITY THEOREM. Let M be a Peano space with the property that the complement on M of every simple closed curve of M consists of at least two and at most a finite number of components, and let D be a complementary domain of a simple closed curve J of M. Then every point of the boundary of D is regularly accessible from D.


This theorem is an immediate consequence of Lemma 1 and Lemma 3. Suppose D has a boundary point P which is not regularly accessible from D. Then, by Lemma 1, there exists a positive number $\epsilon$ such that to every pair of positive numbers $\delta$ and $\eta$ such that $\delta \le \eta < \epsilon$ there corresponds a sequence $\{\alpha_i\}$ of arcs of D with the following properties: 1) for each $i$, $\alpha_i$ is an arc of $\bar S(P,\delta)$ from a point of $F(P,\delta)$ to a point $P_i$; 2) $\{P_i\}$ converges to P; and 3) no two different elements of $\{\alpha_i\}$ can be joined by an arc of $D \cdot S(P,\eta)$. By Lemma 3, however, this is impossible.

This accessibility theorem is used in proving the following theorem, which, as has already been pointed out, is a result obtained previously by Zippin by a different method.


THEOREM 2. Let M be a compact Peano space such that 1) M is not separated by any pair of its points and 2) the complement on M of every simple closed curve of M consists of exactly two components. Then M is a simple closed surface.


We suppose that there exists a compact Peano space M which satisfies the hypothesis of the theorem and yet fails to be a simple closed surface. Since M has no cut point, it must contain a simple closed curve. From a theorem of Zippin⁷ it follows that M must fail to satisfy the Jordan Curve Theorem; that is, there must exist a simple closed curve J in M such that one component D of $M - J$ has as its boundary a proper subset of J. Let $t$ be a minimal arc of J containing the boundary of D, and let P and Q denote the

7 Loc. cit., Theorem 3’, p. 340. 




endpoints of $t$. We denote by $D'$ the other component of $M - J$ and by $t'$ the arc $J -$ <t>.

From the accessibility theorem it follows that both P and Q are accessible from D. We construct an arc $f$ from P to Q such that <f> $\subset D$.

Suppose that f separates D: D—f = D, + D., where D, and 
neither D, nor DP), contains a limit point of the other. Consider the simple 
closed curve J; =t-+f. We have F(D,) C J;, and thus D, + D, consists 
of at least two different domains of $M - J_1$. The component of $M - J_1$ containing $D'$ thus makes at least a third component of $M - J_1$. This contradicts the hypothesis that every simple closed curve of M has exactly two complementary domains. Hence $D - f$ must be connected.

Since by hypothesis the two points P and Q cannot separate D from <t> $+\ D' +$ <t′>, it follows that <t> must contain a boundary point of D. Likewise, since P and Q cannot separate $D +$ <t> from $D' +$ <t′>, we have that <t> contains a boundary point of $D'$. Now consider the simple closed curve $J_2 = t' + f$. The component of $M - J_2$ which contains <t> must contain $D'$ and also the connected set $D - f$. This accounts for all of $M - J_2$, and thus the simple closed curve $J_2$ fails to separate M. This contradiction of our hypothesis completes the proof of the theorem.


HAVERFORD COLLEGE. 


A DISCRETE GROUP ARISING IN THE STUDY OF DIFFERENTIAL 
OPERATORS.* ¹


By EARL D. RAINVILLE.


1. Introduction. The determination of the behavior of a certain class of linear differential equations under the application of the Laplace integral transformation is considerably facilitated by the introduction of another operator² $\sigma$ and the study of the behavior of linear differential operators when subjected to $\sigma$.

We redefine $\sigma$ for convenience of reference. Let $D = d/dx$ be the usual symbol for differentiation with respect to $x$ and let $D^0 = 1$, the identity operator. We shall be concerned with only those linear differential operators which are polynomials in $D$ and $x$. We define $\sigma$ as a linear operator such that

(1) $\sigma x^k D^n = (-1)^k D^k x^n$,

where $k$ and $n$ are non-negative integers.³ By the linearity of $\sigma$ we mean that

(2) $\sigma\left[\sum_s b_s x^{k_s} D^{n_s}\right] = \sum_s b_s\, \sigma(x^{k_s} D^{n_s})$,

where $k_s$, $n_s$ are non-negative integers and the $b_s$ are any complex constants.

We next define⁴ another operator $\alpha$ by

(3) $\alpha x^k D^n = (-1)^n D^n x^k$,

where $k$ and $n$ are non-negative integers, and the requirement that $\alpha$ be linear in the sense of (2) above. When $\alpha$ is applied to a linear differential operator


* Received June 13, 1940. 

1 Presented to the Society under a different title, April 15, 1939. The author wishes 
to express his indebtedness to G. Y. Rainich and R. M. Thrall for suggestions regarding 
the presentation of this material. 

2 The operator $\sigma$ has interesting properties, some of them known to be of importance in applied mathematics. For the behavior of differential operators under $\sigma$ see E. D. Rainville, "Linear differential invariance under an operator related to the Laplace transformation," American Journal of Mathematics, vol. 62 (1940), pp. 391-405. For the precise relation of $\sigma$ to the Laplace transformation see Enzo Levi, "Proprietà caratteristiche della trasformazione di Laplace," Rendiconti Accademia Lincei (6), vol. 24 (1936), pp. 422-426.

3 By $D^k x^n$ we mean that differential operator which acting upon an indeterminate function $F$ yields the $k$-th derivative of the product $x^n F$.

4 For an elementary discussion of $\alpha$ see E. D. Rainville, "Adjoints of linear differential operators," American Mathematical Monthly, vol. 46 (1939), pp. 623-627.




of the type under consideration, it produces the classical adjoint of that 
operator. 

Finally, in analogy to the above definitions, we introduce two other linear operators $\psi$ and $\mu$ such that
(4) yakDn — 


where $k$ and $n$ are non-negative integers.


2. Results. Each of the operators $\sigma$, $\alpha$, $\psi$, $\mu$ may be expressed as a product of the other three. Exhibiting the inverse of each, we show that these operators generate a non-Abelian group, which may be taken to be $G\{\sigma, \alpha, \psi\}$. The elements of $G$ may be represented uniquely in a simple form (Theorem 4). We determine five sole irredundant defining relations for $G$.


3. Preliminary Theorems. Let $E = \sigma^0 = \alpha^0 = \psi^0 = \mu^0$ be the identity operator. It is convenient to use the known results $\sigma^4 = E$, $\alpha^2 = E$, and

(6) $\sigma^2 x^k D^n = (-1)^{k+n} x^k D^n$.


We see at once that the inverses of o and @ exist. 
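The relations just quoted can be checked by direct computation on monomials. What follows is a minimal sketch, not the author's procedure: it assumes that definition (1) reads $\sigma x^k D^n = (-1)^k D^k x^n$ and that definition (3) reads $\alpha x^k D^n = (-1)^n D^n x^k$, stores an operator as a dictionary sending the exponent pair $(k, n)$ to the coefficient of $x^k D^n$, and rewrites $D^k x^n$ in that form by the Leibniz rule. The helper names are illustrative only.

```python
from math import comb, factorial


def normal_order(k: int, n: int) -> dict:
    """Write D**k x**n as a sum of monomials x**a D**b (Leibniz rule)."""
    out = {}
    for j in range(min(k, n) + 1):
        # j derivatives fall on x**n, the remaining k - j stay as D**(k - j).
        coeff = comb(k, j) * (factorial(n) // factorial(n - j))
        out[(n - j, k - j)] = coeff
    return out


def extend_linearly(rule, op: dict) -> dict:
    """Apply a rule given on monomials x**k D**n to a whole operator."""
    result = {}
    for (k, n), c in op.items():
        sign, image = rule(k, n)
        for key, coeff in image.items():
            result[key] = result.get(key, 0) + c * sign * coeff
    # Drop vanishing terms so that equal operators compare equal.
    return {key: v for key, v in result.items() if v != 0}


def sigma(op: dict) -> dict:
    # Assumed reading of (1): sigma x^k D^n = (-1)^k D^k x^n.
    return extend_linearly(lambda k, n: ((-1) ** k, normal_order(k, n)), op)


def alpha(op: dict) -> dict:
    # Assumed reading of (3): alpha x^k D^n = (-1)^n D^n x^k.
    return extend_linearly(lambda k, n: ((-1) ** n, normal_order(n, k)), op)


def check_relations(max_degree: int = 3) -> bool:
    """Verify sigma^4 = E, alpha^2 = E and relation (6) on small monomials."""
    for k in range(max_degree + 1):
        for n in range(max_degree + 1):
            mono = {(k, n): 1}
            if sigma(sigma(sigma(sigma(mono)))) != mono:
                return False
            if alpha(alpha(mono)) != mono:
                return False
            if sigma(sigma(mono)) != {(k, n): (-1) ** (k + n)}:
                return False
    return True
```

As a usage example, `sigma({(1, 1): 1})` returns `{(1, 1): -1, (0, 0): -1}`, that is, $\sigma(xD) = -Dx = -xD - 1$, and applying `sigma` once more returns $xD$, in agreement with (6).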

There exists no positive integral power of $\psi$ equal to $E$ and the same remark holds for $\mu$. To see this consider the differential operator $B = xD$.
Now Hence for s=0. We 
can go further than this. Suppose we define the sum of $g_1 \in G$ and $g_2 \in G$ as that element which operating on any linear differential operator $A$ yields the differential operator $(g_1A + g_2A)$; then we may speak of the ring generated by $\sigma$, $\alpha$, and $\psi$. In the linear set over this ring with coefficients in the complex domain there is no identically vanishing polynomial in $\psi$ alone except that with all coefficients zero; that is, in this extended sense there is no counterpart of the relations $\sigma^4 - E = 0$ and $\alpha^2 - E = 0$.

We are able to construct the inverses of $\psi$ and $\mu$ with the aid of $\sigma$ and $\alpha$. For the sake of symmetry we include results unnecessary for our immediate purpose.⁵
THEOREM 1. $(\alpha\sigma)^2 = (\sigma\alpha)^2 = (\alpha\mu)^2 = (\mu\alpha)^2 = (\sigma\psi)^2 = (\psi\sigma)^2 = E$.


Because of the linearity of the operators involved, we need verify the results only when the operators act upon $x^kD^n$. Directly from the definitions we see that


5 What is essentially part of Theorem 1 appears in L. Schlesinger, Handbuch der Linearen Differentialgleichungen, Leipzig, vol. 1 (1895), p. 426.


138 EARL D. RAINVILLE. 
(7) = a(— 1)* Dia" = Dt = 


and similarly = = (—1)"2*D", ob = ap. 
Theorem 1 follows at once. We have incidentally shown that $\sigma\psi\sigma$ and $\alpha\mu\alpha$
are respectively the inverses of $\psi$ and $\mu$.


THEOREM 2. $\sigma = \mu\alpha\psi$, $\alpha = \mu\sigma\psi$, $\psi = \mu\alpha\sigma$, $\mu = \psi\alpha\sigma$.


The proof of Theorem 2 may be made similarly to that of Theorem 1. As 
an example, using (7) we find that 


D* == pa" D* == a= 
Finally we have in the same manner 
THEOREM 3. $(\alpha\psi)^2 = (\psi\alpha)^2 = (\sigma\mu)^2 = (\mu\sigma)^2 = \sigma^2$.
Part of Theorem 3 will be used as a defining relation for G. 


4. The Group $G\{\sigma, \alpha, \psi\}$. Consider first the group $T_8$ generated by $\sigma$
and $\alpha$. This is simply isomorphic with the octic group⁶ with defining relations
$\sigma^4 = \alpha^2 = (\alpha\sigma)^2 = E$. The elements of $T_8$ may be taken to be a,

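From Theorem 1 we have $\sigma\alpha\sigma = \alpha$. Hence $\sigma^2\alpha = \sigma\alpha\sigma^{-1} = \alpha\sigma^{-2} = \alpha\sigma^2$.
Since $\sigma^2$ is also commutative with $\sigma$, we have proved that, if $t \in T_8$, then
$\sigma^2t = t\sigma^2$.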
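The relations of the octic group, and the fact that the square of the order-4 generator commutes with every element, can be checked on a concrete model. The sketch below is only an illustration: the 2×2 integer matrices `SIGMA` and `ALPHA` are an assumed faithful model of the abstract group of order 8 (a quarter turn and a reflection), not the differential-operator transformations of the paper.

```python
# Assumed matrix model of the octic group; NOT the operators of the paper.
E = ((1, 0), (0, 1))
SIGMA = ((0, -1), (1, 0))   # quarter turn, order 4
ALPHA = ((1, 0), (0, -1))   # reflection, order 2


def matmul(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def power(m, n):
    """n-th power of m for n >= 0."""
    result = E
    for _ in range(n):
        result = matmul(result, m)
    return result


def generated_group():
    """All products of SIGMA and ALPHA, found by closing under right multiplication."""
    elements = {E}
    frontier = [E]
    while frontier:
        g = frontier.pop()
        for s in (SIGMA, ALPHA):
            h = matmul(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


T8 = generated_group()
SIGMA2 = power(SIGMA, 2)
print(len(T8), all(matmul(SIGMA2, t) == matmul(t, SIGMA2) for t in T8))
```

In this model the group has eight elements and the square of the quarter turn is central, in agreement with the statement above.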

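We already know from Theorem 3 that $\psi\alpha\psi\alpha = \sigma^2 = \alpha\psi\alpha\psi$. Then from
the identity $\psi(\alpha\psi\alpha\psi) = (\psi\alpha\psi\alpha)\psi$ we may conclude that $\sigma^2\psi = \psi\sigma^2$. Next by
combining various results given above, we have

$\alpha\sigma\psi = \cdots = \psi\alpha\sigma.$

Thus we have proved that $\psi$ is commutative with $\sigma^2$ and with $\alpha\sigma$.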
Now let $G$ be the group generated by $\sigma$, $\alpha$, and $\psi$. We shall prove


THEOREM 4. If $g \in G\{\sigma, \alpha, \psi\}$, then $g$ may be represented in just one of
the two forms $\psi^s t$ or $\sigma\psi^{s+1}t$, where $s \geq 0$ and $t \in T_8$, and this representation is
unique.


The group $T_8$ contains a subgroup $H$ generated by $\sigma^2$ and $\alpha\sigma$. $T_8$ may be
decomposed into $H + \sigma H$. Each element of $H$ is commutative with $\psi$. Next,
by Theorem 1, $\psi\sigma\psi = \sigma^3$, so that if $\sigma$ does occur between two powers of $\psi$,
one or both exponents of $\psi$ may be brought to zero. Hence, if $g \in G$ and $g \notin T_8$,


⁶ See Carmichael, Groups of Finite Order, 1937, p. 176.




then $g$ may be put in the form $t_1\psi^s t_2$, where $t_1, t_2 \in T_8$ and $s > 0$. Again, $t_1$ is
either commutative with $\psi$ or is the product of $\sigma$ and an element commutative
with $\psi$. We have shown the existence of the representation of Theorem 4.

Let us turn to the question of uniqueness. Let $s_1$ be the smallest positive
integer for which there exists an equality of the type $\delta_1\psi^{s_1}t_1 = \delta_2\psi^{s_2}t_2$ in which
$\delta_1, \delta_2 = E$ or $\sigma$ and $t_1, t_2 \in T_8$ and in which not both of the equations $\delta_1 = \delta_2$
and $t_1 = t_2$ hold. Then
(8) == == ts, 


where $\delta_3 = E$ or $\sigma$ and $t_3 \in T_8$, and not both of $\delta_3 = E$ and $t_3 = E$ hold. We
now apply the inverse of $\psi$ and find


(9) == 
If = then 

where 1, ~ FF. If 8; =o, 

(11) ys? 


so that in any case $\psi^{s_1-1}$ is of the form $\delta_4\psi^{s}t_4$ with not both $\delta_4 = E$ and $t_4 = E$.
Since $s_1$ was to be a minimum, this is a contradiction for $s_1 > 1$. If $s_1 = 1$,
then from (10) or (11) we get an equation of the type $\psi^{s} = t_5$ where $t_5 \in T_8$
and not both $s = 0$ and $t_5 = E$ hold. Then $t_5^8 = E$ but $\psi^{8s} \neq E$; hence the
representation of Theorem 4 is unique.


5. The Defining Relations. Consider now an abstract group with three
generators $B_1$, $B_2$, $B_3$, and the sole defining relations

(12) $B_1^4 = B_2^2 = (B_1B_2)^2 = (B_1B_3)^2 = E, \quad (B_2B_3)^2 = B_1^2.$

We see that, if we put $B_1 = \sigma$, $B_2 = \alpha$, $B_3 = \psi$, the relations of (12) are
satisfied. The proof of Theorem 4 was based upon only the corresponding
relations

(13) $\sigma^4 = \alpha^2 = (\sigma\alpha)^2 = (\sigma\psi)^2 = E, \quad (\alpha\psi)^2 = \sigma^2,$

and the fact that $\psi^s \neq E$ for $s > 0$. Any identity involving the elements of
$G$ must then be a result of (13). We have shown that $G$ is simply isomorphic
with the abstract group defined above.

In order to assure ourselves that the defining relations (12) are irredundant,
we make certain other selections for $B_1$, $B_2$, $B_3$. It is easily verified that,
if we choose Bi =o, “oy, %o, o, 0; B.= WY, %, %, ao, ao; oy, E, oy, 
respectively, then in each of the five cases one, and only one, of the relations in 
(12) breaks down. These choices show that the relations (12) are irredundant. 




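The operators $\sigma^2$, $\alpha\sigma$, and $\psi$ generate an infinite Abelian subgroup of $G$.
We call this subgroup $N$ and note that its representative element may be
written in the form $\psi^m h$ where $h \in H$, the group of order 4 generated by $\sigma^2$ and
$\alpha\sigma$, and where $m$ is an integer, positive, negative, or zero. We shall see that $G$
may be decomposed into $N + \sigma N$. Indeed, from Theorem 1 we have $\psi\sigma = \sigma^3\psi^{-1}$
which, in view of the commutativity of $\psi$ and $\sigma^2$, becomes $\psi\sigma = \sigma\psi^{-1}\sigma^2$. Hence
$\psi^2\sigma = \psi\sigma\psi^{-1}\sigma^2 = \sigma\psi^{-1}\sigma^2\psi^{-1}\sigma^2 = \sigma\psi^{-2}$. Repeated applications of this process
lead to

(14) $\psi^s\sigma = \sigma\psi^{-s}\sigma^{2s},$

where $s$ is a positive integer. Theorem 4 shows that any element of $G$ may be
written in one of the four forms $\psi^s h$, $\psi^s\sigma h$, $\sigma\psi^s h$, $\sigma\psi^s\sigma h$, with $s \geq 0$, $h \in H$.
Using (14) we see that $\psi^s\sigma h = \sigma\psi^{-s}\sigma^{2s}h$ and $\sigma\psi^s\sigma h = \sigma^2\psi^{-s}\sigma^{2s}h = \psi^{-s}\sigma^{2s+2}h$.
Hence $\psi^s h$ and $\sigma\psi^s\sigma h$ are elements of $N$, and $\psi^s\sigma h$ and $\sigma\psi^s h$ are products of
$\sigma$ with elements of $N$.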


UNIVERSITY OF MICHIGAN. 




THE DIRICHLET PROBLEM FOR A HYPERBOLIC EQUATION.* 


By FRITZ JOHN.


This paper deals with the Dirichlet problem for the hyperbolic differential
equation

(1) $u_{xy} = 0,$

i.e. with the problem of determining a solution $u$ of (1) from given values on
a closed curve $C$.¹ Simple examples show that the Dirichlet problem for a
hyperbolic equation has a completely different character from that of the
corresponding problem for an elliptic equation such as the potential equation

(2) $u_{xx} + u_{yy} = 0.$

In the case of a hyperbolic equation the Dirichlet problem certainly is not a
"natural"² problem of mathematical physics, as its solution may neither
exist, nor be uniquely determined, nor depend continuously on the data.

However it is possible to obtain fairly general positive results in this
connection. The general solution of (1) is well known to be of the form
$u = f(x) + g(y)$; more exactly, a solution of (1) is of that form with
univalued functions $f$ and $g$ in every region which is convex in the x- and
y-direction, i.e. the boundary $C$ of which is a Jordan curve, intersected in at
most two points by every parallel to the x- or y-axis. We restrict ourselves to
regions of this kind.
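The form $u = f(x) + g(y)$ has a simple discrete counterpart: the second mixed difference of such a function over any axis-parallel rectangle vanishes, while for a function with $u_{xy} \neq 0$ it does not. The following sketch checks this numerically; the sample functions `f`, `g` and the test rectangle are arbitrary choices made here for illustration.

```python
import math


def mixed_difference(u, x, y, h, k):
    """Second mixed difference of u over the rectangle [x, x+h] x [y, y+k]."""
    return u(x + h, y + k) - u(x + h, y) - u(x, y + k) + u(x, y)


# Hypothetical sample functions, chosen only for illustration.
def f(x):
    return math.exp(x)


def g(y):
    return math.sin(3.0 * y)


def u_separated(x, y):
    return f(x) + g(y)


def u_product(x, y):
    # Here u_xy = 1, so this is not a solution of u_xy = 0.
    return x * y


d_sep = mixed_difference(u_separated, 0.3, -0.2, 0.5, 0.7)
d_prod = mixed_difference(u_product, 0.3, -0.2, 0.5, 0.7)
print(d_sep, d_prod)
```

For the product $xy$ the mixed difference equals the area $hk$ of the rectangle, so it cannot be written as $f(x) + g(y)$.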

We call the points of $C$, in which there is a line of support parallel to the
x- or y-axis, the "vertices" of $C$.³ It is evident that $C$ has either two, three,
or four vertices, the last alternative representing the general case. The
behavior of the Dirichlet problem for $C$ depends largely on that of certain
transformations of $C$ into itself, which suggest themselves immediately. Let
for a given point $P$ of $C$ the point with the same abscissa be denoted by $AP$,
the point with the same ordinate by $BP$, and let $T$ denote the transformation


* Received May 9, 1940; Revised October 17, 1940.

¹ See J. Hadamard, "Equations aux dérivées partielles, le cas hyperbolique,"
L'Enseignement Mathématique, vol. 35 (1936), pp. 25-29.

² See the discussion in Courant-Hilbert, Methoden der mathematischen Physik,
vol. II, p. 176.

³ Following the notation of A. Huber, "Die erste Randwertaufgabe für geschlossene
Bereiche bei der Gleichung $\partial^2 z/\partial x\,\partial y = f(x,y)$," Monatshefte für Mathematik und
Physik, vol. 39 (1932), pp. 79-100.




142 FRITZ JOHN. 


$BA$, i.e. $TP = B(AP)$. The vertices of $C$ are the fixed points of the
transformations $A$ and $B$. The sequence of points $P, TP, T^2P, T^3P, \ldots$ may be
called the A-polygon determined by $P$.⁴ If there is an $n > 0$ for which $T^nP = P$,
we call the smallest such $n$ the "period" of $P$, and $P$ a periodic point of $C$.
In his paper A. Huber seeks to reduce the Dirichlet problem for general $C$
to that for the case of a $C$ with only two or three different vertices, and to that
in which all points of $C$ are of period 2. The method however does not seem
to be applicable to all cases.⁵

Huber also treats a special case, in which the points of an A-polygon are
everywhere dense on $C$, namely the ellipse

(3) $x = a\cos t, \quad y = b\sin(t - \xi),$

where $\xi/\pi$ is irrational. In this case the Dirichlet problem can be solved for
general classes of boundary values, provided $\xi$ satisfies certain conditions
(which actually are of a Diophantine character).
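For the ellipse (3) the transformations $A$, $B$, $T$ can be written down explicitly in terms of the parameter $t$, which makes the role of $\xi/\pi$ visible. The sketch below rests on a direct computation made here (not quoted from the paper): $A$ sends $t$ to $-t$, $B$ sends $t$ to $\pi + 2\xi - t$, so $T = BA$ advances $t$ by $\pi + 2\xi$. The semi-axes and the helper `period` are illustrative assumptions.

```python
import math
from fractions import Fraction

a, b = 2.0, 1.0  # hypothetical semi-axes


def point(t, xi):
    """Point of the ellipse x = a cos t, y = b sin(t - xi)."""
    return (a * math.cos(t), b * math.sin(t - xi))


def A(t):
    # Same abscissa: cos(-t) = cos(t).
    return -t


def B(t, xi):
    # Same ordinate: sin(pi - (t - xi)) = sin(t - xi).
    return math.pi + 2.0 * xi - t


def T(t, xi):
    # T = BA, i.e. first A, then B.
    return B(A(t), xi)


def period(r):
    """Common period of all points when xi = pi * r with r rational.

    T advances t by pi + 2*xi, i.e. by the fraction 1/2 + r of a full turn,
    so the period is the denominator of that fraction in lowest terms.
    """
    return (Fraction(1, 2) + Fraction(r)).denominator


print(period(Fraction(0)), period(Fraction(1, 4)), period(Fraction(1, 6)))
```

With $\xi = 0$ every point has period 2; for irrational $\xi/\pi$ no power of $T$ returns a point to itself, which is the dense case considered by Huber.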

In a recent paper D. G. Bourgin and R. Duffin give a complete discussion
with the help of Fourier series of the case where $C$ is a rectangle with sides
of slope $\pm 1$.⁶ If $\xi$ denotes the ratio of the sides of the rectangle, the solution
of the Dirichlet problem is uniquely determined if and only if $\xi$ is irrational;
the solution exists for all boundary values, which are differentiable a sufficient
number of times, if $\xi$ cannot be approximated "too rapidly" by rationals.

The present paper takes up the case of an arbitrary closed curve $C$, which
is convex in the x- and y-direction. The main feature of the theory developed
here is the close connection established between the Dirichlet problem for $C$
and the topological properties of the transformation $T$ of $C$.

It is known that the Dirichlet problem for the potential equation (2) 
for a curve C is closely associated with the theory of conformal mappings. As 
(2) is invariant under conformal mappings, the Dirichlet problem for C can 
be reduced to that for a circle by a conformal mapping of C on a circle; the 
existence of such a mapping for suitable curves is guaranteed by Riemann’s 
mapping theorem; for a circle again the problem can be solved explicitly either 
with the help of Fourier series or by Poisson’s formula. In a similar manner 


⁴ See Huber, loc. cit., p. 94.

⁵ See the remarks by Hadamard, loc. cit.

⁶ "The Dirichlet problem for the vibrating string equation," Bulletin of the
American Mathematical Society, vol. 45 (1939), pp. 851-859. The results are
reformulated here for our form (1) of the equation, which can be transformed into the
equation $u_{xx} - u_{yy} = 0$ by a simple rotation of axes. Also the Dirichlet problem for
the ellipses (3) treated by Huber can be transformed easily into that for rectangles
with sides of slope $\pm 1$.




the Dirichlet problem for (1) is closely connected with the mappings leaving
(1) invariant, namely the mappings of the form

(4) $x' = f(x), \quad y' = g(y).$

In order that two curves can be mapped on each other by a transformation (4)
their respective $T$-transformations have to be topologically equivalent.

The main results of this paper may be summarized as follows: All Jordan
curves $C$, which are convex in the x- and y-direction, belong to one of two
classes:

A curve $C$ of the first class may be divided into two arcs $C_1$ and $C_2$, such
that the values of $u$ on $C_1$ already completely determine those on $C_2$ uniquely.
The set of boundary functions for which the Dirichlet problem has a solution
is not even "dense."

The curves $C$ of the second class are exactly those for which the A-polygon
of every point is everywhere dense on $C$. For every such curve there is a
bi-continuous mapping of $C$ by a transformation of the form (4) on a rectangle
with sides of slope $\pm 1$ and irrational ratio $\xi$ of sides. In that case the
Dirichlet problem is reduced to the one considered by Bourgin and Duffin. The
ratio $\xi$ is uniquely determined by $C$. $\xi$ is an invariant of $C$ under mappings
of the form (4). Curves of the second class with the same $\xi$ can be mapped
on each other by a bi-continuous transformation of the form (4). For almost
all values of $\xi$ the Dirichlet problem will be solvable for all boundary functions
which are differentiable a sufficient number of times. There are however
special values of $\xi$, for which even analyticity of the boundary values does not
guarantee existence of the solution.

The following sufficient condition for uniqueness of a solution will be 
proved: If the set of periodic points of C is at most denumerable, then the 
solution of the Dirichlet problem is uniquely determined. 


1. Formulation of the Dirichlet problem. In what follows, we denote
by $C$ a closed, continuous, simple curve, which has with every parallel to the
x- or y-axis at most two points in common.⁷


1.1. If $u(x, y)$ is twice continuously differentiable in the open region $B$
bounded by $C$ and continuous in $B + C$, then $u = f(x) + g(y)$, where $f$ and
$g$ are continuous in every point of $B + C$ with the possible exception of those


⁷ The last condition corresponds roughly to the restriction to simply connected
regions in the case of the potential equation (2). The rôle of the interior of the region
in potential theory is played here by the rectangle formed by the characteristic lines
of support. (See Huber, loc. cit., p. 80).




points of C, through which there are two characteristic lines of support of C 
(i.e. of multiple vertices of C). 


Proof. If $(x_1, y) \subset B$ and $(x_2, y) \subset B$, then $(x, y) \subset B$ for all $x$ with
$x_1 < x < x_2$. Hence

$u_y(x_2, y) - u_y(x_1, y) = \int_{x_1}^{x_2} u_{xy}(\xi, y)\,d\xi = 0;$

thus $u_y(x, y) = \phi(y)$ for all $(x, y) \subset B$. Similarly $u_x(x, y) = \psi(x)$ for all
$(x, y) \subset B$. If $(x, y)$ and $(x', y')$ are in $B$, then there is a broken line $L$ in
$B$ joining those two points, which consists of a finite number of segments
parallel to the x- or y-axis; then

$u(x', y') - u(x, y) = \int_L u_x\,dx + u_y\,dy = \int_x^{x'} \psi(\xi)\,d\xi + \int_y^{y'} \phi(\eta)\,d\eta.$

Consequently $u(x, y)$ is of the form $f(x) + g(y)$ in $B$, where $f(x)$ and $g(y)$
are continuous in $B$, as $\psi(x)$ and $\phi(y)$ are continuous. If the lines of support
of $C$, which are parallel to the x- or y-axis, are given by

$x = a, \quad x = b, \quad y = \alpha, \quad y = \beta$

respectively, then $f(x)$ is continuous for $a < x < b$, and $g(y)$ for $\alpha < y < \beta$.
From the continuity of $u$ in $B + C$ we can conclude easily that $f(x)$ and $g(y)$
are continuous as well in every simple vertex (i.e. one lying on only one
characteristic line of support).

In order to make possible a unified treatment of the various cases, we
replace the Dirichlet problem by the following modified problem: Let there
be given a function $v$ on $C$. We say a function $u(x, y)$ is a solution of the
Dirichlet problem for the boundary values $v$, if

(a) $u(x, y) = f(x) + g(y)$,
where $f(x)$ is continuous for $a \leq x \leq b$ and $g(y)$ continuous for $\alpha \leq y \leq \beta$,
(b) $u(x, y) = v$ for $(x, y)$ on $C$.⁸


⁸ This appears reasonable in view of Theorem 1.1. Thus we drop the assumption
of differentiability for $u$, which may be justified by using a suitable generalization
to define $u_{xy}$ for non-differentiable functions. This is essential for the purposes of this
paper only in so far as it permits us to state theorems in a more simplified manner,
without having to add the cumbersome conditions (mostly of a rather obvious kind)
that insure differentiability. It is essential however for all results derived here that
$u$ is assumed to be continuous. The continuity of $f$ and $g$ in the respective closed
intervals, postulated here, is equivalent to continuity of $u$ in $B + C$, with the possible
exception of the cases where $C$ has less than four vertices and one of the multiple
vertices has the character of a cusp.




If for a function $v$ on $C$ the Dirichlet problem has a solution, we call $v$
"regular." A function $v$ on $C$ is called "semi-regular," if there exists a
sequence of regular functions $v_n$ on $C$, such that

(a) $\lim v_n = v$,
(b) $\lim$ (total variation of $v - v_n$ on $C$) $= 0$.⁹

Every regular, and hence also every semi-regular function $v$ on $C$, is obviously
continuous.
Let $A$, $B$, $T$ denote the transformations of $C$ into itself, defined in the
introduction. Then

$A^{-1} = A, \quad B^{-1} = B, \quad T = BA, \quad AT = T^{-1}A, \quad BT = T^{-1}B.$

It is obvious that the Dirichlet problem for a function $v = v(P)$ on $C$
consists in finding two functions $f(P)$ and $g(P)$, continuous on $C$, such that

(5) $v(P) = f(P) + g(P), \quad f(AP) = f(P), \quad g(BP) = g(P).$

We have from (5)

$g(P) - g(TP) = g(P) - g(BAP) = g(P) - g(AP) = v(P) - v(AP),$
$f(P) - f(T^{-1}P) = f(P) - f(BP) = v(P) - v(BP).$

Hence, by induction, for every integer $n > 0$

(6) $g(P) - g(T^nP) = \sum_{k=0}^{n-1} [v(T^kP) - v(AT^kP)],$
    $f(P) - f(T^{-n}P) = \sum_{k=0}^{n-1} [v(T^{-k}P) - v(BT^{-k}P)].$
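The telescoping identity for $g$ in (6) can be checked numerically on a concrete curve. The sketch below uses the ellipse (3) with an arbitrarily chosen shift `XI` and semi-axes, and hypothetical functions `f`, `g`; on this curve $A$ sends $t$ to $-t$ and $B$ sends $t$ to $\pi + 2\xi - t$ (a computation made here, not taken from the paper).

```python
import math

XI = 0.4          # hypothetical shift in the ellipse (3)
a, b = 2.0, 1.0   # hypothetical semi-axes


def x_of(t):
    return a * math.cos(t)


def y_of(t):
    return b * math.sin(t - XI)


def A(t):
    return -t                      # same abscissa


def B(t):
    return math.pi + 2.0 * XI - t  # same ordinate


def T(t):
    return B(A(t))                 # T = BA


# Hypothetical data: boundary values v = f(x) + g(y) restricted to the curve.
def f(x):
    return x * x


def g(y):
    return y ** 3 + y


def v(t):
    return f(x_of(t)) + g(y_of(t))


def residual_of_six(t, n):
    """g(P) - g(T^n P) minus the sum of v(T^k P) - v(A T^k P), k = 0..n-1."""
    total = 0.0
    s = t
    for _ in range(n):
        total += v(s) - v(A(s))
        s = T(s)
    return (g(y_of(t)) - g(y_of(s))) - total


print(residual_of_six(0.7, 25))
```

The printed residual is zero up to rounding, since each term $v(T^kP) - v(AT^kP)$ equals $g(T^kP) - g(T^{k+1}P)$ and the sum telescopes.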


2. Transformations of C into itself. We shall make use of certain
known results concerning arbitrary topological transformations $T$ of a Jordan
curve $C$ into itself.¹⁰


⁹ The total variation of a function $w$ on $C$ is defined as the least upper bound
of $\sum_{i=1}^{n-1} |w(P_i) - w(P_{i+1})|$ for any $n$ points $P_1, \ldots, P_n$ of $C$ lying in the order
indicated by the indices.

¹⁰ The following papers may be consulted in this connection: H. Kneser, "Reguläre
Kurvenscharen auf den Ringflächen," Mathematische Annalen, vol. 91 (1924), pp. 135-154;
J. Nielsen, "Om topologiske Afbildninger af en Jordankurve paa sig selv,"
Matematisk Tidsskrift B (1928), pp. 39-46; A. Denjoy, "Courbes définies par les
équations différentielles à la surface du tore," Journal de Math. pures et appl., vol. 11
(9th series), 1932, pp. 333-375; E. R. van Kampen, "Topological transformations of
a curve," American Journal of Mathematics, vol. 57 (1934), pp. 142-152.




Every ordered pair of different points $P$, $Q$ of $C$ determines uniquely an
open set, the "arc" $I = (P, Q)$, consisting of those points $R$ of $C$, for which
$P$, $R$, $Q$ determine the positive sense on $C$. We denote with $TP$ the image
point of $P$ under the transformation $T$, with $TI$ the set of image points of the
arc $I$. $T$ may be called even, if it preserves sense, i.e. if $T(P, Q) = (TP, TQ)$;
similarly $T$ will be called odd, if always $T(P, Q) = (TQ, TP)$. $P$ is called a
"periodic" point of $T$, if there is an $n > 0$, such that $P = T^nP$; the smallest
such $n$ will be called the "period" of $P$.

If $T$ is an even 1 ↔ 1 continuous transformation of $C$ into itself, obviously
one and only one of the following cases presents itself:

I. All points of $C$ are periodic ($T$ is "periodic"),

II. $C$ contains periodic and non-periodic points ($T$ is "semi-periodic"),

III. No point of $C$ is periodic; there is no $P$ such that the set of points
$P, TP, T^2P, \ldots$ is everywhere dense on $C$ ($T$ is "intransitive"),

IV. No point of $C$ is periodic; for some point $P$ the set of points
$P, TP, T^2P, \ldots$ is everywhere dense on $C$ ($T$ is "transitive").


In case I it is easily shown that all points have the same period $n$. If $P$
is any point, the points $P, TP, \ldots, T^{n-1}P$ divide $C$ into $n$ non-overlapping
arcs; if $I$ is any one of those arcs, the other arcs are given by $TI, T^2I, \ldots, T^{n-1}I$
respectively.

In case II all periodic points have the same period $n$. The set $F$ of
periodic points is closed. The complementary set $C - F$ consists of a
denumerable number of arcs $I$ with endpoints in $F$, each one of which arcs is
fixed under $T^n$; we may call them "periodic" arcs. If $I$ is a periodic arc, the
arcs $I, TI, \ldots, T^{n-1}I$ are non-overlapping. If $Q$ is a point of the periodic
arc $I$, $T^nQ$ is in $I$ as well; let $I_0$ denote that arc with endpoints $Q$ and $T^nQ$,
which is contained in $I$; then all arcs $T^kI_0$ are non-overlapping; all arcs
$T^{mn}I_0$ are contained in $I$; $\lim_{k \to \infty} T^{kn}Q$ exists and is the same for all $Q$ in
$I$, and is either identical with $R$ or with $S$, if $I = (R, S)$. It is evident
that, if $T$ is represented by a regular analytic function, there can be only a finite
number of periodic points, unless all points are periodic.

In case III let $Q$ be any point of $C$, and let $\sigma$ be the set of limit points of
the set of points $Q, TQ, T^2Q, \ldots$. It is known that $\sigma$ is a nowhere dense
perfect set, and is independent of the choice of $Q$.¹² If $I$ is an arc of $C - \sigma$,


¹¹ See Denjoy, loc. cit., pp. 340-341.
¹² See Nielsen, loc. cit., pp. 40-41; van Kampen, loc. cit., p. 145.




all $T^kI$ are non-overlapping. If the transformation $T$ is representable in the
form $s' = f(s)$, using the length of arc $s$ as parameter, and if $f'(s)$ is continuous
and of bounded variation, the case III cannot occur.¹³ (Thus the transformation
$T$ of $C$ defined in the introduction cannot be intransitive, if the curvature
of $C$ varies continuously along the curve).

In case IV $T$ is topologically equivalent to a rotation of a circle; i.e.
there exists a real number $\xi$ and a 1 ↔ 1 continuous, sense-preserving mapping
$t = f(P)$ of the points $P$ of $C$ on the points $e^{2\pi i t}$ of the unit circle in the
complex plane, such that $f(TP) = t + \xi$ (mod 1).¹⁴ The constant $\xi$, which is
necessarily irrational, is uniquely determined (mod 1) by $T$; $\xi$ will be called
the "modul" of $T$. The parameter $t$ is uniquely determined on $C$ (mod 1),
except for an additive constant.
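That the modul is a property of $T$ itself, independent of the coordinate used on $C$, can be illustrated numerically. In the sketch below a rotation by an assumed value `XI` is disguised by a hypothetical change of parameter `h` (which plays the role of the mapping $t = f(P)$), and the modul is recovered as the average advance per step of a lift of $T$. All names and numerical choices are illustrative assumptions.

```python
import math

XI = (math.sqrt(5.0) - 1.0) / 2.0   # hypothetical irrational modul


def h(t):
    """Lift of a sense-preserving homeomorphism of the circle R/Z.

    Its derivative 1 + 0.5*cos(2*pi*t) is positive, and |h(t) - t| < 0.08.
    """
    return t + 0.25 * math.sin(2.0 * math.pi * t) / math.pi


def h_inv(y):
    """Inverse of h by bisection; the root lies within 0.1 of y."""
    lo, hi = y - 0.1, y + 0.1
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if h(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def T_lift(t):
    """Lift of T, defined so that h(T t) = h(t) + XI."""
    return h_inv(h(t) + XI)


def estimate_modul(n, t0=0.0):
    """Average advance per iteration of the lift over n steps."""
    t = t0
    for _ in range(n):
        t = T_lift(t)
    return (t - t0) / n


print(estimate_modul(2000), XI)
```

The estimate differs from `XI` by at most about $0.08/n$, because the lift of $T^n$ differs from a translation by $n\xi$ only by the bounded distortion of `h`.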

In what follows we denote again by $A$, $B$, $T$ the particular transformations
of $C$ into itself defined in the introduction. We shall call the curve $C$ periodic,
semi-periodic, intransitive, or transitive respectively, if the transformation $T$
determined by $C$ has that character. We observe that a curve with less than
four vertices is necessarily semi-periodic, as multiple vertices of $C$ are fixed
points of $T$ and not every point is a fixed point of $T$. If $C$ is transitive, the
modul $\xi$, uniquely determined by $T$, will also be referred to as the "modul of
$C$"; we also agree in this case to determine the arbitrary additive constant
of the parameter $t$ by the condition that $t = 0$ is the parameter value of the
lefthand vertex on $C$; we call the parameter $t$ obtained in this way the canonical
parameter on $C$. The canonical parameter on a transitive curve $C$ of modul $\xi$
is characterized by the properties: a) the values of $t$ (mod 1) are in 1 ↔ 1
continuous correspondence with the points of $C$; b) the direction of increasing
$t$ corresponds to the positive sense on $C$; c) $t = 0$ corresponds to the left hand
vertex on $C$; d) the transformation $T$ is given by $t' = t + \xi$ (mod 1).


2.1. If $C$ is neither periodic nor transitive, there exists an arc $I_0$, such that
none of the arcs $T^kI_0$ and $AT^kI_0$ has a point in common with any other one.


Proof: $T$ is either semi-periodic or intransitive. According to the previous
discussion there exists in either case an arc $I$, such that no two arcs $T^kI$ have a
common point. Then for every point $P$ there is at most one $k$ such that $T^kP$
is in $I$. As there are at most four vertices, $I$ contains at most 4 points of the

¹³ See Denjoy, loc. cit., p. 372; van Kampen, loc. cit., p. 149.
¹⁴ Kneser, loc. cit., pp. 141-144; Nielsen, loc. cit., p. 39. This theorem is also
related to the theorem by M. Morse and G. A. Hedlund ("Symbolic dynamics II,"
American Journal of Mathematics, vol. 57 (1940), pp. 17-19), that a Sturmian series
is identical with a properly chosen mechanical sequence.




form $T^kQ$, where $Q$ is a vertex of $C$. Let now $I_0$ be a subarc of $I$, not
containing any one of these four points. No two arcs $T^kI_0$ will have common points.
If moreover $T^kI_0$ and $AT^rI_0$ had a common point, then also $I_0$ and $T^{-k}AT^rI_0$
$= AT^{k+r}I_0$ would have points in common; as $AT^{k+r}$ is an odd transformation,
and as the arc $I_0$ has a common point with its image under $AT^{k+r}$, there would
necessarily exist a point $R$ in $I_0$, which is fixed under the transformation:
$R = AT^{k+r}R$. If $k + r$ is even, say $= 2m$, this implies

$R = AT^{2m}R, \quad T^mR = AT^mR;$

if $k + r$ is odd, say $= 2m - 1$, we have

$R = AT^{2m-1}R, \quad T^mR = TAT^mR = BT^mR;$

in either case $T^mR$ would be a vertex of $C$, contrary to the assumption that
no point $R$ of this kind lies in $I_0$. Thus no two arcs $T^kI_0$ and $AT^rI_0$ have a
common point. Obviously no two arcs of the form $AT^kI_0$ and $AT^rI_0$ can have
a common point either.


2.2. If $C$ is periodic of period $n$, there exists an arc $I_0$ such that no two of
the arcs $I_0, TI_0, \ldots, T^{n-1}I_0, AI_0, ATI_0, \ldots, AT^{n-1}I_0$ have a common point.

Proof: According to the properties of $T$ in case I, mentioned above,
there is an arc $I$, such that $I, TI, \ldots, T^{n-1}I$ are without common points.
There are at most four different points of the form $T^kQ$ in $I$, where $Q$ is a
vertex. If $I_0$ is a subarc of $I$ not containing any of those four points, $I_0$ satisfies
the statement.


2.3. Let $C$ be transitive and of modul $\xi$. Let $t$ be the canonical parameter of
a point $P$ of $C$. Then the points $AP$ and $BP$ have canonical parameter values
$-t$ and $\xi - t$ respectively. (It follows that the vertices of $C$, as fixed points
of $A$ and $B$, have canonical parameters $0$, $\tfrac{1}{2}$, $\tfrac{1}{2}\xi$, $\tfrac{1}{2}\xi + \tfrac{1}{2}$ respectively.)


Proof: If $t$ is the canonical parameter of the variable point $P$ of $C$, the
parameter of $AP$ is a continuous function $\phi(t)$ (mod 1); as $AT = T^{-1}A$, we
have the functional equation

$\phi(t + \xi) = \phi(t) - \xi$ (mod 1).

Hence, as $\phi(0) = 0$,

$\phi(n\xi) = -n\xi$ (mod 1)

for all integers $n$. If $s$ is an arbitrary real number, there is a sequence of
integers $n_k$, such that $s = \lim_{k \to \infty} n_k\xi$ (mod 1), as $\xi$ is irrational (Kronecker's
theorem); it follows that $\phi(s) = -s$. As $B = TA$, we have $\xi - t$ as
parameter of $BP$.
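In the canonical parameter the statement of 2.3 reduces to arithmetic mod 1, which the following sketch verifies: with $A$ acting as $t \to -t$ and $B$ as $t \to \xi - t$, the product $BA$ is the rotation by $\xi$, and the fixed points of $A$ and $B$ are the four stated parameter values. The value of `XI` and the helper names are assumptions made only for this illustration.

```python
import math

XI = math.sqrt(2.0) - 1.0   # hypothetical irrational modul


def frac(t):
    """Representative of t (mod 1) in [0, 1)."""
    return t - math.floor(t)


def A(t):
    return frac(-t)


def B(t):
    return frac(XI - t)


def T(t):
    return frac(t + XI)


def circle_dist(s, t):
    """Distance between s and t measured on the circle R/Z."""
    d = abs(frac(s) - frac(t))
    return min(d, 1.0 - d)


vertices_A = [0.0, 0.5]                    # fixed points of A
vertices_B = [0.5 * XI, 0.5 * XI + 0.5]    # fixed points of B

print([circle_dist(A(t), t) for t in vertices_A])
print([circle_dist(B(t), t) for t in vertices_B])
```

Both printed lists contain only values that vanish up to rounding, so the four parameters are indeed left fixed by $A$ or $B$.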




2.4. If $C_1$ and $C_2$ are two transitive curves with the same modul $\xi$, there is a
transformation

(7) $x' = \phi(x), \quad y' = \psi(y)$

with monotone increasing continuous functions $\phi$ and $\psi$ mapping $C_1$ on $C_2$.¹⁵


Proof: Let $t$ denote the canonical parameter on both $C_1$ and $C_2$. Let
$R = (x, y)$ be a point interior to $C_1$. Let $(x_1, y)$, $(x_2, y)$, $(x, y_1)$, $(x, y_2)$ be
the four points of intersection of $C_1$ with parallels to the x- and y-axes through
$R$. Let $x_1 < x_2$, $y_1 < y_2$, and let $t_1$, $t_2$, $t_3$, $t_4$ be the canonical parameters of
those four points. Then $t_2 = \xi - t_1$, $-t_3 = t_4$ (mod 1). Take the points
$P_1$, $P_2$, $P_3$, $P_4$ with canonical parameters $t_1$, $t_2$, $t_3$, $t_4$ on $C_2$. $P_1$ and $P_2$ have the
same ordinate, $P_3$ and $P_4$ the same abscissa; let the image $R'$ of $R$ be defined
as the intersection of the lines $P_1P_2$ and $P_3P_4$. It is easily seen that that
mapping has the required properties.

2.5. Every transitive curve of modul $\xi$ can be represented in the form

$x = \phi(\cos 2\pi(t + \xi)), \quad y = \psi(\sin 2\pi t),$

where $\phi$ and $\psi$ are monotone increasing continuous functions.


Proof: $x' = \cos 2\pi(t + \xi)$, $y' = \sin 2\pi t$ is an ellipse with $t$ as canonical
parameter and of modul $\xi$. Our curve can be mapped on this ellipse by a
transformation of the form (7).

2.6. Every transitive curve of modul $\xi$ can be mapped by a transformation
(7) with continuously increasing functions $\phi(x)$ and $\psi(y)$ on a rectangle with
sides of slope $\pm 1$.

Proof: Take a rectangle with sides of slope $\pm 1$ and ratio of sides
$a = \xi/(1 - \xi)$. It is easily seen that if we take on the rectangle as parameter
$t$ the length of arc counted from the lefthand vertex and divided by the total
circumference, then $t$ is the canonical parameter. The rectangle is transitive
and of modul $\xi$. Apply 2.4.¹⁶
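The relation $a = \xi/(1 - \xi)$ between the ratio of sides and the modul is easily inverted, and it maps rational values to rational values in both directions. A minimal sketch follows; the function names are chosen here for illustration, and exact rational arithmetic is used only to make the rational case visible.

```python
from fractions import Fraction


def ratio_from_modul(xi):
    """Ratio of sides a = xi / (1 - xi) of the rectangle, for 0 < xi < 1."""
    return xi / (1 - xi)


def modul_from_ratio(a):
    """Inverse relation xi = a / (1 + a), for a > 0."""
    return a / (1 + a)


# A rational modul corresponds to a rational ratio of sides and conversely.
print(ratio_from_modul(Fraction(1, 3)), modul_from_ratio(Fraction(1, 2)))
```

Since the correspondence preserves rationality, an irrational modul corresponds exactly to an irrational ratio of sides.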

The solution $u$ of the Dirichlet problem for a transitive curve $C$ will be


¹⁵ The mapping can be shown to be uniquely determined. For related questions
see G. Lochs, "Die Jordankurve im Kurvennetz," Abh. Math. Seminars d.
Hamburgischen Univ., vol. 9 (1933), pp. 134-146, and "Eine Randwertaufgabe für
Sechseckgewebe," ibid., pp. 260-264.

¹⁶ A proof of the fact that a rectangle with sides of slope $\pm 1$ and ratio of sides $a$
is periodic for rational $a$ and transitive for irrational $a$ is given by Bourgin and Duffin,
loc. cit., p. 854. The statement is easily seen to be equivalent to the theorem on
reflected rays in a square, given by König and Szücs (see Hardy and Wright, The
Theory of Numbers, p. 366 et seq.).




obtained in terms of the canonical parameter $t$. For a discussion of the
question whether these solutions are differentiable with respect to $x$ and $y$, one would
have to see whether the length of arc $s$ on $C$ is a differentiable function of the
canonical parameter $t$; $s$ is of course a monotone continuous function of $t$.
The points $T^nP$ have the same order as the multiples of the modul $\xi$ (mod 1);
these multiples are, in the average, distributed uniformly (mod 1)¹⁷; hence,
if $t_1$ and $t_2$ are the canonical parameters of two points $P$ and $Q$,

$\Delta t = t_2 - t_1 = \lim_{n \to \infty} \nu(n)/n,$

where $\nu(n)$ is the number of positive integers $k < n$, for which
$T^kP$ is contained in the arc $(P, Q)$; if $\Delta s$ is the distance of $P$ and $Q$,

$\dfrac{ds}{dt} = \lim_{\Delta t \to 0} \dfrac{\Delta s}{\Delta t}.$

It seems to be an open question, how to guarantee the existence
and continuity of $ds/dt$ by suitable conditions on the transformation $T$, i.e. by
conditions on the curve $C$.
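The counting formula for $\Delta t$ can be tried out directly in the canonical parameter, where $T^kP$ has parameter $t_1 + k\xi$ (mod 1). The sketch below counts the visits to the arc $(P, Q)$; the modul `XI` and the endpoints are assumed values chosen only for illustration.

```python
import math

XI = math.sqrt(2.0) - 1.0   # hypothetical irrational modul


def nu(n, t1, t2):
    """Number of k with 0 < k < n for which t1 + k*XI (mod 1) lies in the arc (t1, t2)."""
    width = (t2 - t1) % 1.0
    count = 0
    for k in range(1, n):
        if 0.0 < (k * XI) % 1.0 < width:
            count += 1
    return count


def delta_t_estimate(n, t1, t2):
    """The quotient nu(n)/n, which approximates t2 - t1 for large n."""
    return nu(n, t1, t2) / n


print(delta_t_estimate(20000, 0.1, 0.45))
```

For this particular modul the quotient approaches $t_2 - t_1 = 0.35$ quickly; for moduls that are very well approximable by rationals the convergence can be much slower.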


3. Uniqueness and necessary conditions for existence of solutions of 
the Dirichlet problem. 


3.1. The solution of the Dirichlet problem is uniquely determined, unless C 


contains a non-denumerable number of periodic points of T. 


Proof: Let $v(P) = 0$ on $C$. According to (6) $g(T^nP) = g(P)$ for all
$P$ and $n$. Thus, as $g$ is continuous, $g(Q) = g(P)$ whenever the sets of points
$P, TP, T^2P, \ldots$ and $Q, TQ, T^2Q, \ldots$ have limit-points in common. In case $T$
is intransitive or transitive this is the case for any two points $P$, $Q$; hence
$g =$ const. on $C$. If $T$ is semi-periodic, the two sets have common limit-points
whenever $P$ and $Q$ lie on the same arc free of periodic points; thus in the
semi-periodic case $g$ is constant on every arc free of periodic points; if the set of
periodic points is denumerable, it follows that $g =$ const. throughout $C$.
Then also $f(P) = v - g =$ const. Thus $u = f + g$ is constant inside $C$, and
as $u = 0$ on $C$, it follows that the solution of the Dirichlet problem determined
by $v$ vanishes identically.


3.2. Unless $C$ is transitive, there exists an arc $I_0$ on $C$, such that for every
regular or semi-regular $v(P)$ the values of $v$ on $C - I_0$ determine uniquely
those on $I_0$.


¹⁷ See H. Weyl, "Ueber Gleichverteilung von Zahlen mod Eins," Mathematische
Annalen, vol. 77 (1916).
¹⁸ See Hobson, The Theory of Functions of a Real Variable, 3rd ed., vol. I, p. 364.
¹⁹ In particular this is the case (at least in our formulation of the Dirichlet
problem), if $C$ has only two or three vertices. See Hadamard, loc. cit., p. 26.




Proof: α) Let $C$ be periodic of period $n$. Then $T^nP = P$ for all $P$. Then
for regular $v$ from (6)

$\sum_{k=0}^{n-1} [v(T^kP) - v(AT^kP)] = 0$

for all $P$. The same relation follows for all semi-regular $v$. Let $I_0$ be the arc
determined in 2.2. Then for $P \subset I_0$, $v(P)$ is uniquely determined by the
values of $v$ outside $I_0$.

β) Let $C$ be semi-periodic or intransitive. Let $I_0$ be determined by 2.1.
Let $P$ and $Q$ be points of $I_0$. As all arcs $T^nI_0$ are non-overlapping, it follows
that the distance of $T^kP$ and $T^kQ$ tends towards 0, as $k$ tends towards $+\infty$
or $-\infty$. Hence

$\lim_{n \to +\infty} [g(T^nP) - g(T^nQ)] = \lim_{n \to -\infty} [g(T^nP) - g(T^nQ)] = 0.$

Thus, applying (6) to $P$ and $Q$ and subtracting, we obtain

(8) $\sum_{k=-\infty}^{+\infty} \bigl( [v_m(T^kP) - v_m(T^kQ)] - [v_m(AT^kP) - v_m(AT^kQ)] \bigr) = 0$

for any regular function $v_m$. Let $v$ be semi-regular and $v = 0$ on $C - I_0$. Let
$v_m$ be a sequence of regular functions with $\lim v_m = v$ and

$\lim$ (total variation of $v - v_m$ on $C$) $= 0$;

then also

$\lim$ (total variation of $v_m$ on $C - I_0$) $= 0$.

As all arcs $T^kI_0$ and $AT^kI_0$ are non-overlapping, we have from (8)

$|v_m(P) - v_m(Q)| \leq$ (total variation of $v_m$ on $C - I_0$).

Hence

$|v(P) - v(Q)| = \lim_{m \to \infty} |v_m(P) - v_m(Q)| = 0$

for $P$ and $Q$ in $I_0$. As $v = 0$ outside $I_0$, it follows that $v = 0$ in $I_0$ as well.
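The necessary condition used in the periodic case can be made concrete on the unit circle $x = \cos\theta$, $y = \sin\theta$, where a direct computation (made here for illustration) gives $A\colon \theta \to -\theta$ and $T\colon \theta \to \theta + \pi$, so that every point has period 2. The sketch evaluates the sum $\sum_{k=0}^{n-1}[v(T^kP) - v(AT^kP)]$ for a boundary function of the form $f(x) + g(y)$ and for the boundary values of $xy$; both sample functions are assumptions of this example.

```python
import math


def A(theta):
    return -theta            # same abscissa on the unit circle


def T(theta):
    return theta + math.pi   # T = BA with B: theta -> pi - theta


def defect(v, theta, n=2):
    """Sum of v(T^k P) - v(A T^k P) for k = 0, ..., n-1."""
    total = 0.0
    s = theta
    for _ in range(n):
        total += v(s) - v(A(s))
        s = T(s)
    return total


def v_regular(theta):
    # Boundary values of f(x) + g(y) with f(x) = x**3, g(y) = exp(y).
    x, y = math.cos(theta), math.sin(theta)
    return x ** 3 + math.exp(y)


def v_other(theta):
    # Boundary values of x*y, which is not of the form f(x) + g(y).
    return math.cos(theta) * math.sin(theta)


print(defect(v_regular, 0.3), defect(v_other, 0.3))
```

The first value vanishes up to rounding, while the second equals $2\sin 2\theta$; hence the boundary values of $xy$ on the circle are not regular.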


3.3. Let $C$ be non-transitive. Let $C$ be referred to any continuous parameter
$s$ (say $0 \leq s \leq 1$). Then there are functions $v(P)$ represented by analytic
functions of $s$ (of period 1), which are not semi-regular.


Proof: Let $v$ be a continuously differentiable function of $s$, vanishing on
$C - I_0$ and not identically 0 on $I_0$. There are analytic functions $v_m$ of $s$ of
period 1, such that $\lim v_m = v$, $\lim$ (total variation of $v - v_m$) $= 0$. If every




analytic function of $s$ of period 1 were semi-regular, $v$ would be semi-regular as
well, contrary to 3.2.

We may say then, that unless $C$ is transitive, $C$ may be split up into two
arcs, such that the values of a regular or semi-regular function on one arc
already completely determine those on the other arc. The curves $C$, which
are not transitive, form the first class of curves of the introduction.


4. The Dirichlet problem for a transitive curve. 


Theorem 2.6 reduces the Dirichlet problem for any curve of the second
class, i.e. for any transitive curve $C$, to the corresponding problem for a
rectangle, treated in the paper by Bourgin and Duffin. This section contains
some additions to the observations made in the paper just quoted. They are
formulated for an arbitrary transitive curve $C$ of modul $\xi$ and canonical
parameter $t$. We have from (5) and 2.3: $v(t)$ is regular, if and only if there exist
continuous functions $f(t)$ and $g(t)$ of period 1, such that

$v(t) = f(t) + g(t), \quad f(t) = f(-t), \quad g(t) = g(\xi - t).$

Let

$S_n = \sum_{k=0}^{n-1} [v(k\xi) - v(-k\xi)].$


Necessary and sufficient for regularity of the continuous function v is, thal 
for every « > 0 there is a 8 = 8(e), such that whenever n and m are integers 
with |n€E—m | <8, then | Syin—Sw | <e for all N. 


Proof: a) Necessity of the condition: According to (6) and 2.:
$$g(t)-g(t+n\xi)=\sum_{k=0}^{n-1}\,[\,v(t+k\xi)-v(-t-k\xi)\,].$$
Apply this formula to $t=N\xi$, using uniform continuity of $g$.
b) Sufficiency: For $t=\lim n_i\xi$ (mod 1) let the value of $g$ be defined by
$g(t)=-\lim S_{n_i}$. Then $g(t)$ is uniquely defined and continuous, as
$|S_{n_1}-S_{n_2}|<\epsilon$ for $|n_1\xi-n_2\xi-m|<\delta(\epsilon)$. Moreover, as $v$ is continuous,
$$(9)\qquad g(t+\xi)-g(t)=-\lim\,[\,S_{n_i+1}-S_{n_i}\,]=v(-t)-v(t);$$


hence, by induction over $n$, using $g(0)=0$,
$$g(n\xi)=-\sum_{k=0}^{n-1}\,[\,v(k\xi)-v(-k\xi)\,]=-S_n.$$
Consequently $g(-t)=g(t+\xi)$, $g(t)=g(\xi-t)$. Putting
$$f(t)=v(t)-g(t),$$
it follows from (9), that
$$f(-t)=v(-t)-g(-t)=v(-t)-g(t+\xi)=v(t)-g(t)=f(t).$$
Hence $v(t)$ is regular.
4.2. All functions $v(t)=e^{2\pi iNt}$ for integers $N$ are regular.

Proof: In that case

$$S_n=-i\sin 2\pi nN\xi+i\,(1-\cos 2\pi nN\xi)\cot(\pi N\xi),$$

and $|S_{m+n}-S_m|$ is small for $n\xi$ small (mod 1) [although not uniformly
in $N$].
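
The closed form used in this proof can be checked numerically. The sketch below is an illustration only: it assumes the summation range $S_n=\sum_{k=0}^{n-1}[v(k\xi)-v(-k\xi)]$, takes $v(t)=e^{2\pi iNt}$, and compares the direct sum with $-i\sin 2\pi nN\xi+i(1-\cos 2\pi nN\xi)\cot(\pi N\xi)$. The value of `xi` and the helper names are hypothetical choices, not taken from the paper.

```python
import cmath
import math

def s_direct(n, N, xi):
    """Direct sum S_n = sum_{k=0}^{n-1} [v(k*xi) - v(-k*xi)], v(t) = exp(2*pi*i*N*t).

    The summation range 0..n-1 is an assumption made for this sketch.
    """
    def v(t):
        return cmath.exp(2j * math.pi * N * t)
    return sum(v(k * xi) - v(-k * xi) for k in range(n))

def s_closed(n, N, xi):
    """Closed form -i sin(2 pi n N xi) + i (1 - cos(2 pi n N xi)) cot(pi N xi)."""
    theta = 2 * math.pi * n * N * xi
    cot = math.cos(math.pi * N * xi) / math.sin(math.pi * N * xi)
    return -1j * math.sin(theta) + 1j * (1 - math.cos(theta)) * cot

xi = math.sqrt(2) - 1  # hypothetical irrational value of the modulus
max_gap = max(abs(s_direct(n, N, xi) - s_closed(n, N, xi))
              for n in range(1, 40) for N in (1, 2, 5))
```

The factor $\cot(\pi N\xi)$ grows without bound as $N\xi$ approaches an integer, which is one way to see why the smallness of $|S_{m+n}-S_m|$ cannot be uniform in $N$.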
4.3. All functions $v(t)$ with continuous derivative with respect to $t$ are semi-regular. (Compare with 3.3.)

Proof: For every such function $v$ there exist functions $v_m(t)$ of the form
$$v_m(t)=\sum_{N=-m}^{m}a_N\,e^{2\pi iNt}$$
such that
$$\lim_{m\to\infty}v_m=v,\qquad \lim_{m\to\infty}\ (\text{total variation of } v_m-v)=0.$$


4.4. A necessary condition for regularity of $v(t)$ is the existence of a constant
$M$ such that
$$\Big|\int_0^1 v(t)\sin 2\pi mt\,dt\Big|<M\epsilon,$$
whenever $m$ and $n$ are integers such that $|m\xi-n|<\epsilon$.


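Proof: From (9)
$$\int_0^1 g(t+\xi)\sin 2\pi mt\,dt-\int_0^1 g(t)\sin 2\pi mt\,dt=\int_0^1 v(-t)\sin 2\pi mt\,dt-\int_0^1 v(t)\sin 2\pi mt\,dt;$$
hence
$$2\int_0^1 v(t)\sin 2\pi mt\,dt=(1-\cos 2\pi m\xi)\int_0^1 g(t)\sin 2\pi mt\,dt+\sin(2\pi m\xi)\int_0^1 g(t)\cos 2\pi mt\,dt.$$
This implies 4.4.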
Let e.g. the modulus $\xi$ of $C$ be a transcendental number of Liouville, i.e.


for every $k$ there are $p$ and $q$ with




é P | i 
q 


Then differentiability of v of any fixed finite order is not sufficient for regu- 


| | M 
sin 22 gt dt |< —, 
Lf, In 2x 


© sin 


larity, as from 4. 4. 


whereas 


is a function (4 —3)-times continuously differentiable and not satisfying 


that condition. 
There are values of $\xi$, for which even analyticity of $v$ is not sufficient for
regularity. Let e.g. $n_i$ be defined by


Ne == 3", 
4 
Then for €= 
k=0 
= ny (mod 1) 
k=i+1 
| k=i+1 t 
5) 


| sin | 


| M 
u(t) sin 2xn;t dt < 


whereas for the analytic function 


we obtain 


1 1 
sin 227 nt dt = 


Sufficient conditions for regularity in terms of Diophantine properties of
$\xi$ are given by Bourgin and Duffin in the paper mentioned above.


UNIVERSITY OF KENTUCKY. 


© sin 2a nt 
v(t) => 


ON VOLUME INTEGRAL INVARIANTS OF NON-HOLONOMIC 
DYNAMICAL SYSTEMS.* 


By CLAIR J. BLACKALL.


1. Introduction. We consider non-holonomic dynamical systems with
$n$ independent position coordinates $q_1,\dots,q_n$, subject to the non-integrable
constraints,
$$(a)\qquad \sum_{\alpha=1}^{n}a_{r\alpha}\,\dot q_\alpha=0,\qquad (r=1,\dots,k;\ k<n).$$
We assume that the matrix $(a_{r\alpha})$, $r=1,\dots,k$, $\alpha=1,\dots,n$, is of rank $k$; that the applied forces
$Q_\alpha$, $\alpha=1,\dots,n$, are functions of the $q$'s only; that the kinetic energy
$T=\tfrac12\sum_{\alpha,\beta=1}^{n}t_{\alpha\beta}\,\dot q_\alpha\dot q_\beta$, where the $t$'s are functions of the $q$'s only, where
$\dot q_\alpha=dq_\alpha/dt$, $\alpha=1,\dots,n$, and where the matrix $(t_{\alpha\beta})$ is of
rank $n$ with determinant $>0$. We assume also that $t_{\alpha\beta}=t_{\beta\alpha}$, $\alpha,\beta=1,\dots,n$. In doing
this we lose no generality.
If we introduce the customary, admissible transformation $q_\alpha=q_\alpha$;
$p_\alpha=\partial T/\partial\dot q_\alpha$, $\alpha=1,\dots,n$, we have $T=\tfrac12\sum_{\alpha,\beta=1}^{n}g_{\alpha\beta}\,p_\alpha p_\beta$, where the $g$'s are
subject to the same remarks as the $t$'s. The equations of motion of the
system are¹
$$(A_1)\qquad \frac{dp_\alpha}{dt}=Q_\alpha-\frac{\partial T}{\partial q_\alpha}+\sum_{r=1}^{k}\lambda_r a_{r\alpha},\qquad \frac{dq_\alpha}{dt}=\frac{\partial T}{\partial p_\alpha},\qquad (\alpha=1,\dots,n),$$
where $T$ is expressed in terms of the $p$'s and $q$'s, and where the $\lambda$'s are
Lagrange's multipliers. A manner of determining the $\lambda$'s is indicated later.
In this discussion $t$ represents the time. The state of the system for any $t$
may be represented by a point in the $(p,q)=(p_1,\dots,p_n,q_1,\dots,q_n)$ phase
space. The equations of motion define the motion of a particle in this space,
i.e., if the initial position of such a particle is given we can follow its course.
We assume all the continuity necessary for subsequent manipulations.


2. Determination of the $\lambda$'s.² Since


* Received January 23, 1939. 

¹ Cfr. P. Appell, Traité de mécanique rationnelle, vol. 2 (1904), p. 396.

² D. C. Lewis, Jr., “On line integrals and differential equations, especially those
of dynamics,” Bulletin of the American Mathematical Society (April, 1927), p. 320.


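$$\frac{dq_\alpha}{dt}=\frac{\partial T}{\partial p_\alpha}=\sum_{\beta=1}^{n}g_{\alpha\beta}\,p_\beta,$$
(a) may be written $\sum_{\alpha,\beta=1}^{n}a_{r\alpha}g_{\alpha\beta}p_\beta=0$, $r=1,\dots,k$. Differentiation of these
last equations with respect to $t$, combined with substitutions from $(A_1)$, yields
$$\sum_{\alpha,\beta,\gamma=1}^{n}\frac{\partial(a_{r\alpha}g_{\alpha\beta})}{\partial q_\gamma}\,\frac{dq_\gamma}{dt}\,p_\beta+\sum_{\alpha,\beta=1}^{n}a_{r\alpha}g_{\alpha\beta}\Big(Q_\beta-\frac{\partial T}{\partial q_\beta}+\sum_{s=1}^{k}\lambda_s a_{s\beta}\Big)=0,\qquad (r=1,\dots,k).$$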


Solving these equations for the $\lambda$'s we have
$$(b)\qquad \lambda_r=\tfrac12\sum_{\beta,\gamma=1}^{n}l_{r\beta\gamma}\,p_\beta p_\gamma-F_r(q),\qquad (r=1,\dots,k),$$
where the $l$'s and $F$'s have their necessary meanings and are functions of the
$q$'s only. It is worthy of note that the $l$'s are independent of the applied
forces $Q_\alpha$. We take $l_{r\beta\gamma}=l_{r\gamma\beta}$.
No generality is lost in doing this.


3. Equations $(A_1)$ and equations $(A_2)$. The equations
$$\sum_{\alpha,\beta=1}^{n}a_{r\alpha}g_{\alpha\beta}\,p_\beta=c_r,\qquad (r=1,\dots,k),$$
where the $c$'s are constants, are first integrals of $(A_1)$. The actual motions of
the system are confined to the $(2n-k)$-dimensional manifold defined by (a).
If we use (a) to eliminate $k$ of the $p$'s, say $p_{n-k+1},p_{n-k+2},\dots,p_n$, from $(A_1)$
we obtain a system of $(2n-k)$ differential equations which represent the
actual motions of the system and no others. We shall put the label $(A_2)$ on
these resulting equations. Equations $(A_1)$ represent the motions represented
by the equations $(A_2)$ plus others which are not actually possible. Those not
actually possible are the ones for which at least one of the $c$'s is $\ne0$.


4. Volume integral invariants.³ The system of differential equations
$$(B)\qquad dx_i/dt=X_i(x_1,\dots,x_n),\qquad (i=1,\dots,n),$$
defines the motion of a particle in the $(x_1,x_2,\dots,x_n)$ space. If at time zero
an arbitrary $n$-dimensional region $V_0$ is selected, it may be considered to flow
into a corresponding region $V_t$ in time $t$.


3 Hedrick and Dunkel’s translation of Goursat’s A Course in Mathematical Analysis, 
Part II of vol. II (1917), p. 85. 




An integral $\int M(x_1,\dots,x_n)\,dx_1\cdots dx_n$ is said to be a volume
integral invariant of (B) if $\int_{V_t}M\,dx_1\cdots dx_n$ is independent of $t$.
A necessary and sufficient condition for an $M$ which is continuous and
has continuous first partial derivatives is that
$$(C)\qquad \sum_{i=1}^{n}\frac{\partial(MX_i)}{\partial x_i}=0.$$
Some questions concerning $(A_1)$ and $(A_2)$ are now appropriate. Do
such systems always have a non-trivial $M$? Do they never have a non-trivial
$M$? Or is the answer somewhere between?

Some answers to questions of this sort are given in the subsequent dis- 


cussion. 
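
The condition of § 4, that $\sum_i\partial(MX_i)/\partial x_i$ vanish, can be tested numerically for simple planar systems. The following is a minimal sketch under stated assumptions: the three example vector fields, the weight $M=x_1$ and the function name `divergence_MX` are hypothetical illustrations and do not come from the paper; derivatives are estimated by central differences.

```python
def divergence_MX(M, X, point, h=1e-5):
    """Central-difference estimate of sum_i d(M * X_i)/dx_i at `point`."""
    total = 0.0
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        total += (M(plus) * X(plus)[i] - M(minus) * X(minus)[i]) / (2 * h)
    return total

# Hypothetical example systems dx/dt = X(x) in the plane.
def oscillator(x):
    return (x[1], -x[0])               # divergence-free

def damped(x):
    return (x[1], -x[0] - 0.5 * x[1])  # divergence is -0.5

def stretch(x):
    return (x[0], -2.0 * x[1])         # divergence is -1

def one(x):
    return 1.0                         # trivial density M = 1

def weight(x):
    return x[0]                        # candidate density M = x_1 (positive for x_1 > 0)

p = [0.7, -1.3]
c_oscillator = divergence_MX(one, oscillator, p)
c_damped = divergence_MX(one, damped, p)
c_stretch_plain = divergence_MX(one, stretch, p)
c_stretch_weighted = divergence_MX(weight, stretch, p)
```

In this sketch the undamped oscillator admits $M=1$, the damped one does not, and the third field admits the non-constant density $M=x_1$ in the half-plane $x_1>0$ although $M=1$ fails there.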
5. On volume integral invariants for $(A_1)$.
THEOREM. A necessary and sufficient condition for the existence of an
invariant integral of the type $\int M(q_1,\dots,q_n)\,dp_1\cdots dp_n\,dq_1\cdots dq_n$
for system $(A_1)$, where $M$ is different from zero and continuous with continuous
first partial derivatives, is that $\sum_{\alpha,\delta,\sigma=1}^{n}\sum_{r=1}^{k}a_{r\alpha}l_{r\alpha\delta}t_{\delta\sigma}\,dq_\sigma$ be an exact
differential [$=-d\log M$].

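Proof. Assuming $M>0$ (which causes no loss in generality) let us first
write (C) of § 4 in the form
$$\sum_{i=1}^{n}X_i\frac{\partial M}{\partial x_i}+M\sum_{i=1}^{n}\frac{\partial X_i}{\partial x_i}=0,$$
which states that
$$\frac{d\log M}{dt}=-\sum_{i=1}^{n}\frac{\partial X_i}{\partial x_i}.$$
For equations $(A_1)$ this reads
$$\frac{d\log M}{dt}=-\sum_{\alpha=1}^{n}\sum_{r=1}^{k}a_{r\alpha}\frac{\partial\lambda_r}{\partial p_\alpha}.$$
From § 2,
$$\frac{\partial\lambda_r}{\partial p_\alpha}=\sum_{\delta=1}^{n}l_{r\alpha\delta}\,p_\delta,\qquad (\alpha=1,\dots,n;\ r=1,\dots,k).$$
From § 1,
$$p_\delta=\sum_{\sigma=1}^{n}t_{\delta\sigma}\frac{dq_\sigma}{dt},\qquad (\delta=1,\dots,n).$$
Hence we have
$$\frac{d\log M}{dt}=-\sum_{\alpha,\delta,\sigma=1}^{n}\sum_{r=1}^{k}a_{r\alpha}l_{r\alpha\delta}t_{\delta\sigma}\frac{dq_\sigma}{dt}.$$
This equation may be written
$$\sum_{\sigma=1}^{n}\Big[\frac{\partial\log M}{\partial q_\sigma}+\sum_{\alpha,\delta=1}^{n}\sum_{r=1}^{k}a_{r\alpha}l_{r\alpha\delta}t_{\delta\sigma}\Big]\frac{dq_\sigma}{dt}=0,$$
which, due to the arbitrariness of the $\dot q$'s, requires that
$$\frac{\partial\log M}{\partial q_\sigma}=-\sum_{\alpha,\delta=1}^{n}\sum_{r=1}^{k}a_{r\alpha}l_{r\alpha\delta}t_{\delta\sigma},\qquad (\sigma=1,\dots,n),$$
and so
$$\sum_{\alpha,\delta,\sigma=1}^{n}\sum_{r=1}^{k}a_{r\alpha}l_{r\alpha\delta}t_{\delta\sigma}\,dq_\sigma=-d\log M,$$
which proves the theorem.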
It is an interesting fact that this condition is independent of the applied 


forces. 
6. A useful lemma.⁴ The following is a useful lemma.


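Hypotheses. The functions
$$X_i(x_1,\dots,x_n),\ (i=1,\dots,n);\qquad f_s(x_1,\dots,x_n),\ (s=1,\dots,k,\ k<n);\qquad M(x_1,\dots,x_n)>0$$
are continuous, have continuous partial derivatives of first and second order,
and satisfy the following conditions:

(1) $\sum_{i=1}^{n}\dfrac{\partial(MX_i)}{\partial x_i}=0$ whenever all the $f_s=0$, $s=1,\dots,k$.

(2) $\sum_{i=1}^{n}\dfrac{\partial f_s}{\partial x_i}X_i=0$ whenever all the $f_s=0$, $s=1,\dots,k$.
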

(2a) — —— \; }=0 whenever all the fy —0, B, s’ =1,° °°, 

j=1 OF; 

(3) the rank of $\|\partial f_s/\partial x_i\|$, $s=1,\dots,k$, $i=1,\dots,n$, is $k$ in a region $R$ to which we confine
our attention.


⁴ The writer's knowledge of this lemma came from a letter written to him by Dr.
D. C. Lewis, Jr., of Cornell University. Cfr. Appell, Mécanique rationnelle, vol. 2
(1904), p. 444.


Conclusions I. A solution of the differential equations
$$(4)\qquad \frac{dx_i}{dt}=X_i,\qquad (i=1,\dots,n),$$
which satisfies $f_s=0$, $s=1,\dots,k$, initially, must always satisfy this
condition.

II. The flow on the manifold $f_s=0$, $s=1,\dots,k$, admits in $R$ a
volume integral invariant, whose integrand never vanishes and is continuous
with continuous first partial derivatives.


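Proof. Choose any functions $f_{k+1}(x_1,\dots,x_n),\dots,f_n(x_1,\dots,x_n)$ such that the Jacobian
$$\frac{\partial(f_1,\dots,f_n)}{\partial(x_1,\dots,x_n)}$$
does not vanish near an arbitrary point $P$ of the manifold $f_s=0$, $s=1,\dots,k$.
The equations
$$(5)\qquad u_i=f_i(x_1,\dots,x_n),\qquad (i=1,\dots,n),$$
then determine an admissible transformation with inverse
$$(6)\qquad x_i=x_i(u_1,\dots,u_n),\qquad (i=1,\dots,n),$$
valid in an $n$-dimensional neighborhood of $P$.⁵
If we transform system (4) with the help of these equations, we can write
our differential system as follows:
$$(7)\qquad \frac{du_i}{dt}=U_i(u_1,\dots,u_n),\qquad (i=1,\dots,n),$$
where
$$U_s=\sum_{\alpha=1}^{n}\frac{\partial f_s}{\partial x_\alpha}X_\alpha,\qquad (s=1,\dots,k),$$
which in accordance with (2) vanishes whenever $u_1=u_2=\dots=u_k=0$.
Hence any solution of
$$(8)\qquad \frac{du_i}{dt}=U_i(0,\dots,0,u_{k+1},\dots,u_n),\qquad (i=k+1,\dots,n),$$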

⁵ Remarks. Condition (1) is fulfilled if (4) has a volume integral invariant.
Conditions (2) and (2a) are fulfilled if the $f$'s are first integrals of (4).


when joined to $u_1=u_2=\dots=u_k=0$ constitutes a solution of (7) lying
on the manifold $u_1=u_2=\dots=u_k=0$. The existence and uniqueness
theorems for systems of differential equations⁶ show that these are all the
solutions of (7) or (4) lying on this manifold initially. Thus I has now been
proved (on the basis of hypotheses (2) and (3) only).

Let V be any n-dimensional region in the neighborhood of P. 

Let V+ be the n-dimensional region into which the region V is carried 
after a time ¢ according to the law of flow defined by the differential equa- 
tions (4) or (7). 


Let J(t) = f dl M dx,---,da». Then 
Ve 


Let 


Then by the laws for transformation of multiple integrals 


V't 


where V’; is the map of V;. Hence 


Comparing (9) and (10) and taking into account the arbitrariness of V 


we can readily show that 


~ 0x; i » Zn) 


i=1 


Now, by hypothesis, the left hand side of (11) vanishes on the manifold
$f_s=0$, $s=1,\dots,k$. Hence the right hand side also vanishes on this manifold;
i.e., when $u_1=u_2=\dots=u_k=0$. So, on this manifold,
$$(12)\qquad \sum_{i=1}^{n}\frac{\partial(NU_i)}{\partial u_i}=0.$$
From hypothesis (2) we know that $U_s=0$ on the manifold $f_s=0$,
$s=1,\dots,k$. From hypothesis (2a) we know that $\partial U_s/\partial u_s=0$ on this
manifold. These equations hold for $s=1,\dots,k$.


⁶ George D. Birkhoff, “Dynamical systems,” American Mathematical Society Colloquium
Publications, vol. 9 (1927).


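Hence, on the manifold, we have
$$\sum_{i=k+1}^{n}\frac{\partial(NU_i)}{\partial u_i}=0.$$
Hence the system (8) admits the integral invariant
$$\int N(0,\dots,0,u_{k+1},\dots,u_n)\,du_{k+1}\cdots du_n$$
at least in the neighborhood of $P$.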

This chain of remarks suffices to establish the conclusions of the lemma,
inasmuch as it is certainly possible to embed the manifold $f_s=0$, $s=1,\dots,k$,
in a set of neighborhoods of points like $P$.

This lemma enables us to deduce the existence of a volume integral
invariant for $(A_2)$ when the existence has been demonstrated for $(A_1)$. It
also enables us to deduce the non-existence of a volume integral invariant for
$(A_1)$ when the non-existence has been demonstrated for $(A_2)$.


7. On volume integral invariants for $(A_2)$.


THEOREM. Not all systems $(A_2)$ have an $M\ne0$ and analytic in the
neighborhood of a selected equilibrium point, i.e., a point for which all the
right members of $(A_2)$ vanish.

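Proof. One example will prove this statement. Let us take
$T=\tfrac12p_1^2+\tfrac12p_2^2+\tfrac12p_3^2$ and let the applied forces $Q_\alpha=0$, $\alpha=1,2,3$. For
(a) let us take $q_1q_2^2\dot q_1-\dot q_3=0$. In this example the equations $(A_1)$ are
$$\frac{dp_1}{dt}=\lambda q_1q_2^2,\quad \frac{dp_2}{dt}=0,\quad \frac{dp_3}{dt}=-\lambda,\quad \frac{dq_1}{dt}=p_1,\quad \frac{dq_2}{dt}=p_2,\quad \frac{dq_3}{dt}=p_3,$$
where
$$\lambda=-\frac{q_2^2p_1^2+2q_1q_2p_1p_2}{1+q_1^2q_2^4}.$$
(a) may be written $q_1q_2^2p_1-p_3=0$. If we use this equation to eliminate
$p_3$ from $(A_1)$ we have the equations of motion $(A_2)$ for this example.
These are
$$\frac{dp_1}{dt}=\lambda q_1q_2^2,\quad \frac{dp_2}{dt}=0,\quad \frac{dq_1}{dt}=p_1,\quad \frac{dq_2}{dt}=p_2,\quad \frac{dq_3}{dt}=q_1q_2^2p_1.$$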


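The origin $(0,0,0,0,0)$ will serve as the equilibrium point.
Let us assume the theorem false. Consider $M>0$ (which is no loss in
generality) and let $W=\log M$. Expanding $W$ in ascending powers of $p_1$
and $p_2$ we have
$$W=W_0+W_1p_1+W_2p_2+\cdots,$$
where the $W$'s with subscripts are functions of the $q$'s only. $W$ satisfies the
equation $dW/dt=-\sum_i\partial X_i/\partial x_i$, which is
$$(d)\qquad \frac{\partial W}{\partial p_1}\frac{dp_1}{dt}+p_1\frac{\partial W}{\partial q_1}+p_2\frac{\partial W}{\partial q_2}+q_1q_2^2p_1\frac{\partial W}{\partial q_3}=\frac{2q_1q_2^3(q_2p_1+q_1p_2)}{1+q_1^2q_2^4}.$$
Equation (d) may be written
$$p_1\Big[\frac{\partial W_0}{\partial q_1}+q_1q_2^2\frac{\partial W_0}{\partial q_3}-\frac{2q_1q_2^4}{1+q_1^2q_2^4}\Big]+p_2\Big[\frac{\partial W_0}{\partial q_2}-\frac{2q_1^2q_2^3}{1+q_1^2q_2^4}\Big]+\text{terms of higher degree in the } p\text{'s}=0.$$
This, then, requires that
$$\frac{\partial W_0}{\partial q_1}+q_1q_2^2\frac{\partial W_0}{\partial q_3}=\frac{2q_1q_2^4}{1+q_1^2q_2^4},\qquad \frac{\partial W_0}{\partial q_2}=\frac{2q_1^2q_2^3}{1+q_1^2q_2^4}$$
in the neighborhood of $(0,0,0,0,0)$.
From the second of these equations we see that $W_0=\tfrac12\log(1+q_1^2q_2^4)+\Omega(q_1,q_3)$
where $\Omega$ is arbitrary. Substituting in the first equation we have
$$\frac{\partial\Omega}{\partial q_1}+q_1q_2^2\frac{\partial\Omega}{\partial q_3}=\frac{q_1q_2^4}{1+q_1^2q_2^4},$$
which requires that
$$(1+q_1^2q_2^4)\Big(\frac{\partial\Omega}{\partial q_1}+q_1q_2^2\frac{\partial\Omega}{\partial q_3}\Big)-q_1q_2^4=0.$$
Since the coefficients of the powers of $q_2$ must vanish, it follows that
$\partial\Omega/\partial q_1=\partial\Omega/\partial q_3=0$ and hence $-q_1q_2^4=0$ in the neighborhood of
$(0,0,0,0,0)$. Since this is not possible, the theorem is proved.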

8. On volume integral invariants for $(A_1)$.


THEOREM. Not all systems $(A_1)$ have an $M\ne0$ and analytic in the
neighborhood of an equilibrium point.


Proof. The theorem of § 7 and the lemma of § 6 suffice to prove this
theorem.


9. On equations $(A_1)$ having non-trivial $M$'s. If the $\lambda$'s are independent
of the $p$'s the equations $(A_1)$ admit $\int dp_1\cdots dp_n\,dq_1\cdots dq_n$
as a volume integral invariant. The system afforded by a sphere rolling
without sliding on a plane is a system of this type.
The following is an example of a system having a non-trivial $M$. Let


T = 39° + 302? + 343°, G1 = Q2 = 0, Qs = + sin qs and (a) be 
sin — = 0. In this example 


__ COS YsPiPs 


1 + sin? qs 
and the equations (A,) are 
dp, Sin qs qs dp: __ COSQs~ips_ 
dt 1 + sin? PiPs) “ay 1+sin?q,’ dt + sin 


is a volume integral invariant for this example of equations (A,). 


The origin $O$ is taken at the center of the hollow sphere. The rectangular
$x$, $y$, and $z$ axes are fixed in space and taken as indicated. The position
of the center of the rolling sphere is given by the spherical coördinates
$(A-a,\alpha,\beta)$ where $A-a$ is the constant radius vector, $\alpha$ is
the angle formed by the $z$ axis and the radius vector, and $\beta$ is
the angle formed by the $xz$ plane and the plane
determined by the $z$ axis and the radius vector.


Fig. 1.



Fig. 2. 


The orientation of the rolling sphere about its center is given by Eulerian
angles taken as indicated in Fig. 2. The rectangular $x$, $y$, and $z$ axes are,
as stated above, fixed in space, and the rectangular $x'$, $y'$, and $z'$ axes
are fixed in the rolling sphere with origin $O'$ at the center of the
rolling sphere. By translation and no rotation we put the
origin $O'$ at $O$. We now have Fig. 2. $ON$ is the line of
nodes in which the $x'y'$ plane of Fig. 2 cuts the
$xy$ plane. $\psi$, $\phi$, $\theta$ are our Eulerian angles.


10. Another example: We consider here the case of a homogeneous 
sphere of radius (a) rolling without sliding in a hollow sphere of inner 
radius, (A). 

Let = y — sin 6 cos 

Q, = 6 cos y+ sin sin y 
+ ¢ cos 6. 


For the constraint equations we notice that: 


T. Ag=—aO,sin B+ aQ,cos B and 
II. ABsin = —aQ, cos B cos —Q, sin B cos + sin 2. 


These equations may be put in the form: 


I. A& = a6 cos (B+y)+ ad sin + w) 


II. AB sin a —aésin (B+ wy) cosa 
+ ad[cos (B + sin 6 cos + cos a] + sin 


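The kinetic energy
$$T=\tfrac12m(A-a)^2\dot\alpha^2+\tfrac12m(A-a)^2\sin^2\alpha\,\dot\beta^2+\tfrac12I\dot\theta^2+\tfrac12I\dot\phi^2+I\dot\phi\dot\psi\cos\theta+\tfrac12I\dot\psi^2,$$
where $(m)$ is the mass of the rolling sphere and $I$ is its moment of inertia
with respect to a diameter.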

Let us call q1, B= qo, G3. 6 = Ga, 5. Then the constraint 


equations are: 
A, 
I. — — Gs cos (G2 + Gs) — 8in Sin (G2 + 95) = 0 
a 


IT. jp sin + sin (G2 + Gs) COs 
a 
— + gs) sin gz cos + cos gz Sin gi] — qs Sin g; = 0 


H => (A—a)*.? + (A —a)*q.? sin? g, + + 
+ cos Ys + + U (G15 Ya» 9s) 


where U is the potential energy. 


pi = m(A po = m(A G13 ps = 

Ps = 1G, + cos qs; ps =1qs cos qs + 1Gs 
sO: 

CSC? Gy ‘ 3 

— Ps COS Jz + Ps 

I sin® 


Ps: — Ps COS 


T sin? 


is = 


We may now write 


po* + 3 7 Ps +3 T 


*m(A—a)? m(A—a)?* 
CSC? COS Js CSC? 


The equations of motion (A,) are 


lp sc? q, cot A 0U 

dt m(A —a) a dt dq a 

dps _ esc? cot , esc? qs cot Qs esc? qs [1 -+ cos? qs] 


3 




=— — A, sin sin(q2 + 9s)—- Az[Ccos(g2 + gs) sin gs COs + COS q; sin q,] 
— sin q; 


and the five equations for the (q’s) in terms of the (p’s) and (q’s). ‘The 


constraint equations may be written 


am(A—a)?* I 
A ese qi sin (q2 qi as) CSC Jz COS G1 


(g2 + Ys) CSC COS Jz COS Ji — SIN 
ve 


am(A—a)? 


Differentiation of I with respect to ¢ and appropriate substitution yields 


A E sc* q; cot Qi.» cos(q2 + 4s) [= Jz Cot Gz 
am(A—a)? I I 


m(A—-a)? a 
+- ps? — * gal Ps — Ax CoS(qz + qs) 


+ Az, sin(g2 + qs) cos 4s) 
-[—A, sin gs sin(gs + gs) —A2{cos(gs + qs) sin gs COS qi 
sin(qz + qs) Pz qu 


+ cos sin g:} + Az cos sin + - 7 Ps —a)? 


— Ps COS 4s + Ps SC qs Cot qs sin (q2 + Qs] CSC Jz COS(J2 Yo) 
I sin? qs I 


Pp» ese? — ps COS Jz + Ps sin (q2 + qs) 
m(A — ay? sin? [Ps — Ps gs] — Pl 


-—a function (41, G25 V4: Ys) = 9. 


From this 


A esc? qi COT G1 sin (q» Ys) ese? 


CSC Jz CSC? COS(Gz + Ys) CSC Jz COS Js G1 COS(G2 + Ys) 
Im(A—a)? Im(A —a)? 


P2Ps 


+ a function (91, 4 


Differentiation of II with respect to ¢ and appropriate substitution yields 




A ese 4 sing, + qs) COS qx Jz cot », 


gm(A—a)* I i 
sc? q, cot ds, galt 
4. Ys esc? 9s] Ax €08(J2 + 9s) 


+ A» sin(g2 + qs) cos | — c08( [—A, sin sin(q2 + 4s) 


— Az{cos(q2 + Js) sin Ys COS J; + COs gz Sin qi} | 


COS Ja CSC Js + Js) COS — Sin qi 
258 1 


A ese q, cot 9; sin(q2 + qs) sin q; 
am?(A —a)! Im(A —a)? Pips 
4. C08 (q2 + Js) COS [ Ps COS Ys + 
Ps | m(A—a)? | I sin? q3 
sin(g2 + gs) cse gz Cos ps esc? G3 + 
) 
Ps m(A—a)? I sin? qs 
+ qs) CSC COt Fs COS G1 cos(q2 + Ys) Js SIN qi 
+ Im(A —a)? 
sin(q2 + gs) CSC Js COS Js COS [ 
I *Lm(A—a)? 
— Ps 608 + cos (q2 + qs) G3 COS 
I sin? qs ia 
cos(q2 + Ys) cot sin q; 
Im(A —a)* ~Im(A —a)? 
) 
— a function of (41. G2, Js) =O 
Da) From this 
1 A ese cot qi sin(gz + gs) sin 
2= 4 3 
—a)? ae am?(A —a)! Im(A —a)? 
C08 (q2 qs ) ese Ys sin 1 cos (q2 qs ) cot VE sin qi + cos 
Im(A —a)? Pips“ Im(A —a)? PiPs 


cos(qd2 + Js) cos q; esc? qy sin (qz + Ys) ese Ja COS qi CSC? 41 


Im(A— a)? Im(A—a)? 


SiN (G2 + Ys) CSC Js COS J, 


Im(A—a)? 


+ a function of as Ys)- 




The expression > ss for this example is 


1 OX; 
ng [= ese cot qi 


[= am? (A —a)* 
a’m(A —a)* 
€08(qz + qs) CO8 sin(q2 qs) Js COS qi 
Im(A — a)? Im(A —a)? P 
sin (gz + qs) CSC Gz COS Gz COS Gi CSC? 
Im(A —a)? ps | — cos (q2 + qs) 
[— sin ps | + sin(qz + qs) cos 


ES +45) sing: + gs) cos qr 
Im(A—a)? Ps Im(A —a)? 


4 


CSC Jz CSC? J, COS( G2 + qs) 


— [cos(q: + qs) sin gs cos + COs qs sin qi | 
[ cos(g2 + Ys) ese gz sin qi sin (gz + gs) ese Js Cos qi ] 


Im(A —a)? Im(A —ay? 
+ qs) cot gs sin + cos 
—sIn 
Im(A —a)? Pi 
sin(gz + qs) cot gs COs gi esc” 
Im(A —a)? Pz 


This expression may be written 


A 1 208 2 5 
1 cot [ 1 __ (q2 + qs) p 
am(A ) 


[ am(A—a)* A—a)? 7 3 
a®m(A—a)*? TI 
Ys + Ys — pz COS Ys ) |. 


The quantity in brackets is exactly the left member of constraint equa- 
tion I and, therefore, vanishes on the 8-dimensional manifold defined by 
I and II and on the 9-dimensional manifold defined by I. The reduced (A2) 
system would, therefore, have a non-trivial volume integral invariant. This 
example is interesting because it illustrates the use of the lemma of paragraph 6. 


COLLEGE OF ST. THOMAS,
ST. PAUL, MINNESOTA.


ON THE LAW OF THE ITERATED LOGARITHM.* 


By PHILIP HARTMAN and AUREL WINTNER.


1. Let $z_1(t),z_2(t),\dots$ be an infinite sequence of real-valued independent
functions¹ of class $(L^2)$ on the interval $0\le t\le1$. Suppose that
the expected value of $z_1(t)+\dots+z_n(t)$ vanishes for every $n$, and that
its standard deviation becomes infinite as $n\to\infty$; in other words, that
$$(1)\qquad \int_0^1 z_n(t)\,dt=0$$
and that, as $n\to\infty$,
$$(2)\qquad B_n\to\infty,\quad\text{where } B_n=b_1+\dots+b_n \text{ and } b_n=\int_0^1[z_n(t)]^2\,dt.$$
Kolmogoroff's law of the iterated logarithm² states that
$$(3)\qquad \limsup_{n\to\infty}\frac{z_1(t)+\dots+z_n(t)}{(2B_n\log\log B_n)^{1/2}}=1\quad\text{for almost all } t,$$
provided that every $z_n(t)$ is a bounded function and its bound is subjected
to the limitation
$$(4)\qquad \text{l.u.b.}_{0\le t\le1}\,|z_n(t)|=o\Big(\Big(\frac{B_n}{\log\log B_n}\Big)^{1/2}\Big)\quad\text{as } n\to\infty;$$
(it is understood that $t$-sets of measure zero may be neglected).

The literature³ does not appear to contain a criterion which goes beyond
this boundedness condition of Kolmogoroff; a condition which unfortunately
is not satisfied in case of most of the sequences $\{z_n(t)\}$ which occur in standard
applications. In order to see this, it is sufficient to consider the simplest
possible case, that in which the distribution function of $z_n(t)$ is independent
of $n$ and has, in accordance with (1) and (2), a vanishing first moment and
a finite non-vanishing second moment. Clearly, condition (4) then is satisfied
only if the common distribution function of the $z_n(t)$ is constant outside
a finite interval; so that the theorem breaks down even in the simplest cases
to which the classical limit theorems of the theory of probability are applicable.

* Received July 28, 1940.

¹ The independence of the functions or of “random variables” is meant in the sense
in which it was always used in the analytic theory of probability; cf. e.g., A. Kolmogoroff,
Grundbegriffe der Wahrscheinlichkeitsrechnung, Berlin (1933), p. 150.

² A. Kolmogoroff, “Über das Gesetz des iterierten Logarithmus,” Mathematische
Annalen, vol. 101 (1929), pp. 126-135.

³ Cf. P. Lévy, Théorie de l'addition des variables aléatoires, Paris (1937), pp. 258-289,
where, in particular, Lévy's preceding investigation is presented (“La loi forte
des grands nombres pour les variables aléatoires enchainées,” Journal de Mathématique,
ser. 9, vol. 15 (1936), pp. 11-24, more particularly pp. 15-24).

Nevertheless, one would expect that a law so fundamental as (3) is valid
in this case of identical distributions, and even in the case of nearly identical
distributions. But as far as the existing literature seems to go, it is essential
to assume, not only that the distribution function of $z_n(t)$ be constant outside
a finite interval (which may depend on $n$), but also that the length of this
interval be
$$(4\ \text{bis})\qquad o\Big(\Big(\frac{B_n}{\log\log B_n}\Big)^{1/2}\Big)\quad\text{as } n\to\infty.$$
An example has even been constructed⁴ to show this $o$-condition to be so
essential, that the mere passage from $o$ to $O$ is capable of destroying the law
of the iterated logarithm.

2. We shall, however, prove that the above conjecture as to the un- 
restricted validity of the law of the iterated logarithm in case of unbounded 
but equal, or nearly equal, distributions is nevertheless correct. In fact, the 


situation which occurs in the cases mentioned in §1 is taken care of by the 
theorem to be proved, which may be formulated as follows: 


Suppose that the first and second moments of the distribution functions
$\sigma_1(x),\sigma_2(x),\dots$ of the independent functions
$x_1(t),x_2(t),\dots$ satisfy the conditions
$$(5)\qquad \int_{-\infty}^{+\infty}x\,d\sigma_n(x)=0$$
and
$$(6)\qquad B_n/n>\text{const.}>0,\quad\text{where } B_n=\gamma_1+\dots+\gamma_n \text{ and } \gamma_n=\int_{-\infty}^{+\infty}x^2\,d\sigma_n(x)<\infty,$$
and that the $\sigma_n(x)$ possess a dominant in the following sense: There exists
a distribution function, $\tau(x)$, which has a second moment
$$(7)\qquad \int_{-\infty}^{+\infty}x^2\,d\tau(x)<\infty$$
and is such that
$$(8)\qquad \int_{|x|\ge r}d\sigma_n(x)=O\Big(\int_{|x|\ge r}d\tau(x)\Big),\qquad r\to\infty,$$
holds uniformly in $n$. Then
$$(9)\qquad \limsup_{n\to\infty}\frac{x_1(t)+\dots+x_n(t)}{(2B_n\log\log B_n)^{1/2}}=1\quad\text{for almost all } t$$
(and so, since $x_n(t)$ may be replaced by $-x_n(t)$,
$$(9\ \text{bis})\qquad \liminf_{n\to\infty}\frac{x_1(t)+\dots+x_n(t)}{(2B_n\log\log B_n)^{1/2}}=-1\quad\text{for almost all } t).$$
The rôle of the assumptions (6) and (8) is to impose on the distributions
$\sigma_n$ a uniform lower and upper estimate, respectively (in fact, (6) is certainly
satisfied if $0<\text{const.}<\gamma_n<\infty$).

⁴ J. Marcinkiewicz and A. Zygmund, “Remarque sur la loi du logarithme itéré,”
Fundamenta Mathematicae, vol. 29 (1937), pp. 215-222.
3. It is easy to see by a partial integration of
$$\int_{|x|\ge r}f(x)\,d\{\sigma(x)-\rho(x)\},$$
that if $\sigma(x)$ and $\rho(x)$ are monotone non-decreasing functions for which
$$\int_{|x|\ge r}d\sigma(x)\le\int_{|x|\ge r}d\rho(x)$$
holds for every $r>\text{const.}$, and if $f(x)$ is any even, positive, continuous function
which does not decrease when $|x|$ increases, then
$$\int_{|x|\ge r}f(x)\,d\sigma(x)\le\int_{|x|\ge r}f(x)\,d\rho(x)$$
($\le\infty$) also holds for every $r>\text{const.}$ On applying this remark to
$\sigma=\sigma_n$, $\rho=\tau$; $f(x)=|x|^\nu$, where $\nu=1,2$,
one sees that the assumptions (7) and (8) imply the estimates
$$(10_1)\qquad \int_{|x|\ge r}|x|\,d\sigma_n(x)=O\Big(\int_{|x|\ge r}|x|\,d\tau(x)\Big),\qquad r\to\infty,$$
$$(10_2)\qquad \int_{|x|\ge r}x^2\,d\sigma_n(x)=O\Big(\int_{|x|\ge r}x^2\,d\tau(x)\Big),\qquad r\to\infty,$$
uniformly for all $n$.


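On the other hand, it is clear that there exists, for every distribution
function $\tau$ which has a finite second moment, another distribution function,
$\tau^*$, which has a finite second moment and satisfies
$$\int_{|x|\ge r}|x|\,d\tau(x)=o\Big(\int_{|x|\ge r}|x|\,d\tau^*(x)\Big)\quad\text{and}\quad\int_{|x|\ge r}x^2\,d\tau(x)=o\Big(\int_{|x|\ge r}x^2\,d\tau^*(x)\Big)$$
as $r\to\infty$. Hence, on writing $\tau$ for $\tau^*$, one sees that, without violating (7),
the $O$ of $(10_1)$–$(10_2)$ may be replaced by $o$.

Accordingly, there exists a function $\phi=\phi(r)$, $0<r<\infty$, which tends
to 0 as $r\to\infty$ and is such that, for every $n$ and $r$,
$$(10_\nu\ \text{bis})\qquad \int_{|x|\ge r}|x|^\nu\,d\sigma_n(x)\le\phi(r)\int_{|x|\ge r}|x|^\nu\,d\tau(x),\qquad (\nu=1,2).$$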


Obviously, ¢(7") may be chosen to be a decreasing function; so that 
(11) 0—¢(+ ~) << o(11) < © for > 0. 
Hence, one can construct a positive decreasing function e—e«(r),0<r< om, 
which satisfies the conditions 

(12,) >¢(r'”); (122) e(r) log 
and 


(13) e(r) 90 as r> 


and is such that the function 

14 A(r) = —— ] is monotone increasing, 

(7) log log r (7) 
if const. r< Then, by (122), 

(15) A(r) > rs, 
Clearly, one can assume that A(r) is defined for 0 < rS const. in such a way 
that A(7) is monotone and satisfies (15) for 0 r< o. 

4. In terms of the given sequence, $\{x_n(t)\}$, of independent functions,
define on the same $t$-interval, $0\le t\le1$, another sequence, $\{z_n(t)\}$, of independent
functions, by placing
$$(16)\qquad z_n(t)=\begin{cases}x_n(t)-a_n, & \text{if } |x_n(t)|\le\lambda(n),\\ -a_n, & \text{if } |x_n(t)|>\lambda(n),\end{cases}$$
where $a_n$ denotes the number
$$(17)\qquad a_n=\int_{|x|\le\lambda(n)}x\,d\sigma_n(x).$$


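It will be shown that these functions $z_n(t)$ satisfy the three conditions
(1), (2), (4) of Kolmogoroff for the validity of (3).

The relation (1) is clear from (16) and (17).

As to the second moment of the distribution function of $z_n(t)$, it is
similarly seen that
$$\int_0^1[z_n(t)]^2\,dt=\int_{|x|\le\lambda(n)}x^2\,d\sigma_n(x)-\Big(\int_{|x|\le\lambda(n)}x\,d\sigma_n(x)\Big)^2,$$
and so, according to (5) and the definition of $\gamma_n$ in (6),
$$(18)\qquad \int_0^1[z_n(t)]^2\,dt=\gamma_n-\int_{|x|>\lambda(n)}x^2\,d\sigma_n(x)-\Big(\int_{|x|>\lambda(n)}x\,d\sigma_n(x)\Big)^2.$$
But the Schwarz inequality and $(10_2)$ imply that
$$(19_1)\qquad \Big(\int_{|x|>\lambda(n)}x\,d\sigma_n(x)\Big)^2\le\int_{|x|>\lambda(n)}x^2\,d\sigma_n(x)=O\Big(\int_{|x|>\lambda(n)}x^2\,d\tau(x)\Big),\qquad (n\to\infty);$$
while (7) and (15) show that
$$(19_2)\qquad \int_{|x|>\lambda(n)}x^2\,d\tau(x)\to0,\qquad (n\to\infty).$$
Hence, from (18),
$$\int_0^1[z_n(t)]^2\,dt-\gamma_n\to0.$$
Since this relation, when compared with (6), implies that
$$(20)\qquad b_1+\dots+b_n\sim\gamma_1+\dots+\gamma_n\quad\text{as } n\to\infty,$$
condition (2) is now verified.

Finally, it is clear from (17) that
$$\text{l.u.b.}\,|z_n(t)|\le\lambda(n)+\Big|\int_{|x|>\lambda(n)}x\,d\sigma_n(x)\Big|=O(\lambda(n)),$$
by $(19_1)$–$(19_2)$. Hence, from (14) and (13),
$$\text{l.u.b.}\,|z_n(t)|=o\Big(\Big(\frac{n}{\log\log n}\Big)^{1/2}\Big).$$
It follows, therefore, from (6) and (20) that (4) is satisfied.
This completes the proof of (3) for the functions (16).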
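
The truncation and centering step of § 4 can be illustrated on a finite discrete distribution with exact rational arithmetic. The sketch below is an illustration under stated assumptions: it reads (16) as $z=x-a$ when $|x|\le\lambda$ and $z=-a$ otherwise, with $a=\int_{|x|\le\lambda}x\,d\sigma(x)$ as in (17); the three-point distribution `sigma`, the cut-off `lam` and the function names are hypothetical. It checks that the new variable has mean zero and that its second moment equals the right-hand side of (18).

```python
from fractions import Fraction as F

def truncate_and_center(dist, lam):
    """dist: list of (value, probability) pairs with mean zero.

    Returns (a, z_dist), where a is the truncated mean over |x| <= lam and
    z takes the value x - a if |x| <= lam and -a otherwise.
    """
    a = sum(x * p for x, p in dist if abs(x) <= lam)
    z_dist = [((x - a) if abs(x) <= lam else -a, p) for x, p in dist]
    return a, z_dist

def moment(dist, k):
    """k-th moment of a finite discrete distribution."""
    return sum(x ** k * p for x, p in dist)

# Hypothetical three-point distribution with mean zero.
sigma = [(F(-1), F(19, 30)), (F(1, 2), F(4, 15)), (F(5), F(1, 10))]
lam = F(2)

a, z = truncate_and_center(sigma, lam)
gamma = moment(sigma, 2)
tail1 = sum(x * p for x, p in sigma if abs(x) > lam)      # first moment over |x| > lam
tail2 = sum(x * x * p for x, p in sigma if abs(x) > lam)  # second moment over |x| > lam
second_moment_z = moment(z, 2)
rhs_18 = gamma - tail2 - tail1 ** 2
largest_z = max(abs(v) for v, _ in z)
```

Because the arithmetic is exact, the comparison of `second_moment_z` with `rhs_18` is an identity check rather than an approximation; the bound `largest_z <= lam + abs(tail1)` mirrors the estimate used for condition (4).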


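5. In order to pass from (3) to (9), it will first be shown that if $\rho_n$
denotes the non-negative number
$$(21)\qquad \rho_n=\int_{|x|>\lambda(n)}|x|\,d\sigma_n(x),$$
then
$$(22)\qquad \sum_{k=16}^{\infty}\frac{\rho_k}{(k\log\log k)^{1/2}}<\infty$$
(notice that $(\log\log k)^{1/2}$ is real ($>0$) only if $k\ge16$).

It is clear from (21) and $(10_1$ bis) that
$$\rho_k\le\phi(\lambda(k))\int_{|x|>\lambda(k)}|x|\,d\tau(x).$$
Hence, if $L(n)$ is an abbreviation for the positive function
$$(23)\qquad L(n)=(n\log\log n)^{-1/2},\qquad (n\ge16),$$
the partial sum $\sum_{k=16}^{n}$ of (22) is majorized by the expression
$$\sum_{k=16}^{n}L(k)\,\phi(\lambda(k))\int_{|x|>\lambda(k)}|x|\,d\tau(x),$$
which, after partial summation, appears in the form
$$(24)\qquad \sum_{k=16}^{n-1}\Big(\sum_{j=16}^{k}L(j)\phi(\lambda(j))\Big)\int_{\lambda(k)<|x|\le\lambda(k+1)}|x|\,d\tau(x)+\Big(\sum_{j=16}^{n}L(j)\phi(\lambda(j))\Big)\int_{|x|>\lambda(n)}|x|\,d\tau(x).$$
Accordingly, the proof of (22) will be complete if one shows that the function
(24) of $n$ is $O(1)$, as $n\to\infty$.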
To this end, notice first that, by (15) and (11), 
< p(k). 
Hence, by (12,). 
< e(k). 


Since A(r) is monotone increasing |cf. (14) ], it follows that 


<k 


kiB<je 
Consequently, by (23) and (14), 


< 


B 




Qn the other hand, (11) and (23) imply that 
16j<ki/s 


Since (15) shows that k’/* = O(A(k)), it follows by addition of the 
last two relations that 


L(j)6(A(j)) = O(a(k)) + = O(A(K)). 


j=16 


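Hence, the function (24) of $n$ is
$$O\Big(\sum_{k=16}^{n-1}\lambda(k)\int_{\lambda(k)<|x|\le\lambda(k+1)}|x|\,d\tau(x)\Big)+O\Big(\lambda(n)\int_{|x|>\lambda(n)}|x|\,d\tau(x)\Big)$$
$$=O\Big(\sum_{k=16}^{n-1}\int_{\lambda(k)<|x|\le\lambda(k+1)}x^2\,d\tau(x)\Big)+O\Big(\lambda(n)\int_{|x|>\lambda(n)}|x|\,d\tau(x)\Big)$$
$$=O\Big(\int_{|x|\le\lambda(n)}x^2\,d\tau(x)\Big)+O\Big(\int_{|x|>\lambda(n)}x^2\,d\tau(x)\Big)=O(1)+O(1),\quad\text{by (7)}.$$
This proves (22).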
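6. In order to write (22) in the form in which it will be needed in the
proof of (9), define for $0\le t\le1$ a function $y_n(t)$ as follows:
$$(25)\qquad y_n(t)=\begin{cases}0, & \text{if } |x_n(t)|\le\lambda(n),\\ x_n(t), & \text{if } |x_n(t)|>\lambda(n).\end{cases}$$
Then, since $\sigma_n(x)$ denotes the distribution function of $x_n(t)$,
$$(26)\qquad \int_0^1|y_n(t)|\,dt=\int_{|x|>\lambda(n)}|x|\,d\sigma_n(x).$$
Hence, from (21) and (22),
$$(27)\qquad \sum_{k=16}^{\infty}\frac{\int_0^1|y_k(t)|\,dt}{(k\log\log k)^{1/2}}<\infty.$$
It follows, therefore, from Fatou's inequality that
$$(28)\qquad \sum_{k=16}^{\infty}\frac{|y_k(t)|}{(k\log\log k)^{1/2}}<\infty\quad\text{for almost all } t.$$
But a partial summation shows that if $\{a_n\}$ is any sequence for which the series
$$\sum_{k=16}^{\infty}\frac{a_k}{(k\log\log k)^{1/2}}$$
is convergent, then
$$a_1+\dots+a_n=o(n\log\log n)^{1/2}.$$
Hence, (28) and (27), respectively, imply that
$$(29)\qquad |y_1(t)|+\dots+|y_n(t)|=o(n\log\log n)^{1/2}\quad\text{for almost all } t$$
and
$$(30)\qquad \int_0^1|y_1(t)|\,dt+\dots+\int_0^1|y_n(t)|\,dt=o(n\log\log n)^{1/2}.$$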
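
The partial-summation step used here (convergence of $\sum a_k/(k\log\log k)^{1/2}$ forces $a_{16}+\dots+a_n=o(n\log\log n)^{1/2}$) can be observed numerically. The sketch below is only an illustration: the sequence $a_k=(k\log\log k)^{1/2}/k^{3/2}$ is a hypothetical choice for which the normalized series is $\sum k^{-3/2}$, and the sum is started at $k=16$, where the normalizer is real.

```python
import math

def c(k):
    """Normalizer (k log log k)^(1/2); real and positive for k >= 16."""
    return math.sqrt(k * math.log(math.log(k)))

def ratio(n, a):
    """(a_16 + ... + a_n) / c(n)."""
    return sum(a(k) for k in range(16, n + 1)) / c(n)

def a(k):
    """Hypothetical sequence with sum a_k / c(k) = sum k^(-3/2) convergent."""
    return c(k) / k ** 1.5

ratios = [ratio(n, a) for n in (10**3, 10**4, 10**5)]
```

The computed ratios decrease as $n$ grows, in line with the $o$-statement; a finite computation of this kind illustrates the statement but does not prove it.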


7. The proof of the theorem announced in § 2 is now immediate.
In fact, it is clear from (16), (25) and (17), (26), respectively, that
$$|x_n(t)-z_n(t)-a_n|\le|y_n(t)|\quad\text{and}\quad|a_n|\le\int_0^1|y_n(t)|\,dt.$$
It follows, therefore, from (29) and (30) that
$$\sum_{k=1}^{n}[x_k(t)-z_k(t)]=o(n\log\log n)^{1/2}\quad\text{for almost all } t.$$
But $n\log\log n=O(B_n\log\log B_n)$, by (6). Consequently,
$$x_1(t)+\dots+x_n(t)=z_1(t)+\dots+z_n(t)+o(B_n\log\log B_n)^{1/2}\quad\text{for almost all } t.$$
Hence, (9) follows from (3) and (20).


QUEENS COLLEGE,
THE JOHNS HOPKINS UNIVERSITY.


A NEW DERIVATION OF THE EQUATIONS FOR THE 
DEFORMATION OF ELASTIC SHELLS.* 


By ERIC REISSNER.


1. Introduction. The equations for the deformation of thin shells were
first established by A. E. H. Love [1], who thereby corrected and completed
previous attempts by Aron [2] and Mathieu [3]. A reproduction of
Love's work is found in his Treatise on the Mathematical Theory of Elasticity.
The problem has been reexamined repeatedly. In this connection reference is 
made to the work of Krauss [4], Trefftz [5] and Odquist [6]. 

In the present paper an attempt is made to present the part of the theory 
concerned with small displacements in as simple a way as possible. In that
respect two results in the following developments may be mentioned as sig- 
nificant. One is an elucidation of the reason why it is of special advantage to 
choose on the middle surface of the shell the lines of curvature as parametric 
curves. The other is a modified derivation of the stress-strain relations which 
utilizes directly the known expressions for the strain components with respect 
to orthogonal systems of coordinates and the assumption that the displacement 
components vary linearly with the distance along the normal from the middle 


surface of the shell. 


2. Basic Assumptions of the Theory. The following assumptions are 
made: 

1° The thickness of the shell is small compared with the radii of curva- 
ture of its middle surface. 

2° The stress components normal to the middle surface are small com- 
pared with the other stress components and may be neglected in the stress- 
strain relations. 

3° The normals of the undeformed middle surface are deformed into the 
normals of the deformed middle surface. 

4° The displacements are so small that the equilibrium conditions for 
deformed elements are the same as if the elements were not deformed. 


It should be said that 2° can be considered as a consequence of 1°, which
is established by means of equilibrium considerations, and that 3° follows in
first approximation from 2° by means of the stress-strain relations.

* Received August 11, 1940; Revised October 14, 1940. Presented to the American
Mathematical Society, December 30, 1940.

The results of the theory are obtained in three main steps. First an
appropriate system of coordinates on the shell is introduced and certain geometrical
relations established. Then the equilibrium conditions for an element of the
shell are formulated. Finally the system of equations is completed by deriving
the relations between displacement components and stress resultant components.


3. The Coordinate System. The location of a point of the shell is given
by three parameters, two of which vary on the middle surface of the shell and 
the third along the normal to the middle surface. The condition we impose 
on the parametric curves is that they form a three-dimensional orthogonal 
system. This condition is imposed because for a non-orthogonal system of 
coordinates the expressions for the strain components and the relations between 
stresses, stress resultants and displacements are considerably more complicated 
than in the case of orthogonal coordinates. In what follows vector calculus 
is employed which simplifies the presentation considerably. 

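The radius vector to a point of the shell may be written in the form

(1) $\mathbf{R}(\xi_1, \xi_2, \zeta) = \mathbf{r}(\xi_1, \xi_2) + \zeta\,\mathbf{n}(\xi_1, \xi_2)$

where $\mathbf{r}$ denotes a vector to the middle surface; $\xi_1$ = const. and $\xi_2$ = const. are
the parametric curves on the middle surface and $\zeta$ is the distance of the point
from the middle surface measured along the unit normal vector $\mathbf{n}$.

We require that the line element has the form

(2) $ds^2 = A_1^2\,d\xi_1^2 + A_2^2\,d\xi_2^2 + d\zeta^2$

and find by a simple calculation that

(3) $ds^2 = d(\mathbf{r} + \zeta\mathbf{n}) \cdot d(\mathbf{r} + \zeta\mathbf{n})$

is of the form (2), provided

(4) $\dfrac{\partial\mathbf{r}}{\partial\xi_1}\cdot\dfrac{\partial\mathbf{r}}{\partial\xi_2} = \dfrac{\partial\mathbf{r}}{\partial\xi_1}\cdot\dfrac{\partial\mathbf{n}}{\partial\xi_2} = \dfrac{\partial\mathbf{r}}{\partial\xi_2}\cdot\dfrac{\partial\mathbf{n}}{\partial\xi_1} = 0.$

It is well known that (4) is the characteristic property of the lines of curvature,
for which also

(5) $\dfrac{\partial\mathbf{n}}{\partial\xi_m} = \dfrac{1}{R_m}\dfrac{\partial\mathbf{r}}{\partial\xi_m}$, (m = 1,2)

with $R_m$ for the principal radii of curvature. With the notation

(6) $\alpha_m^2 = \dfrac{\partial\mathbf{r}}{\partial\xi_m}\cdot\dfrac{\partial\mathbf{r}}{\partial\xi_m}$, (m = 1,2)
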


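and with (4) and (5) there follows from (3)

(7) $ds^2 = \alpha_1^2\left(1 + \dfrac{\zeta}{R_1}\right)^2 d\xi_1^2 + \alpha_2^2\left(1 + \dfrac{\zeta}{R_2}\right)^2 d\xi_2^2 + d\zeta^2.$

On setting

(8) $\mathbf{t}_m = \dfrac{1}{\alpha_m}\dfrac{\partial\mathbf{r}}{\partial\xi_m}$, (m = 1,2)

we have the following well-known formulae¹ for the derivatives of the tangent
unit vectors

(9) $\dfrac{\partial\mathbf{t}_m}{\partial\xi_m} = -\dfrac{1}{\alpha_n}\dfrac{\partial\alpha_m}{\partial\xi_n}\,\mathbf{t}_n - \dfrac{\alpha_m}{R_m}\,\mathbf{n}$, $\dfrac{\partial\mathbf{t}_m}{\partial\xi_n} = \dfrac{1}{\alpha_m}\dfrac{\partial\alpha_n}{\partial\xi_m}\,\mathbf{t}_n$, (m = 1,2; n = 2,1)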


which are subsequently needed. 
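As a numerical illustration of (1), (5) and (7), the sketch below takes a sphere of radius `a` as the middle surface, with the polar and azimuthal angles as lines-of-curvature parameters and the outward unit normal, so that $R_1 = R_2 = a$, $\alpha_1 = a$ and $\alpha_2 = a\sin\theta$. The sphere, the function names and the finite-difference step are assumptions chosen for the illustration and are not part of the paper.

```python
import math

def radius_vector(theta, phi, zeta, a=2.0):
    """R = r + zeta*n for a sphere of radius a, with outward unit normal n = r/a."""
    s = a + zeta
    return (s * math.sin(theta) * math.cos(phi),
            s * math.sin(theta) * math.sin(phi),
            s * math.cos(theta))

def lame_coefficients(theta, phi, zeta, a=2.0, eps=1e-6):
    """Central-difference estimates of A_1, A_2 and of the mixed term dR/dtheta . dR/dphi."""
    def diff(p, q):
        return [(u - v) / (2 * eps) for u, v in zip(p, q)]
    d_theta = diff(radius_vector(theta + eps, phi, zeta, a),
                   radius_vector(theta - eps, phi, zeta, a))
    d_phi = diff(radius_vector(theta, phi + eps, zeta, a),
                 radius_vector(theta, phi - eps, zeta, a))
    norm = lambda vec: math.sqrt(sum(c * c for c in vec))
    mixed = sum(u * v for u, v in zip(d_theta, d_phi))
    return norm(d_theta), norm(d_phi), mixed

def predicted(theta, zeta, a=2.0):
    """A_m = alpha_m (1 + zeta/R_m), read off from (7), for the sphere."""
    return a * (1 + zeta / a), a * math.sin(theta) * (1 + zeta / a)
```

For instance, `lame_coefficients(0.7, 0.3, 0.1)` can be compared with `predicted(0.7, 0.1)`; the mixed term should vanish to rounding error, which is the orthogonality demanded by (2).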


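4. The Equilibrium Conditions. Consider an element of the shell
bounded by surfaces $\xi_m$ = const., $\xi_m + d\xi_m$ = const. and $\zeta = \pm h/2$. Forces and
moments acting on all six faces must be in equilibrium. We denote by $\mathbf{N}$ the
force resultants and by $\mathbf{M}$ the moment resultants, per unit of length measured
along the parametric curves on the middle surface. $\mathbf{N}_1$ and $\mathbf{M}_1$ act on the
faces normal to $\mathbf{t}_1$ and $\mathbf{N}_2$ and $\mathbf{M}_2$ on the faces normal to $\mathbf{t}_2$. By $\mathbf{p}$ we denote
the external force per unit of area of the middle surface. Taking into account
that stress resultants as well as areas change with the coordinates of the middle
surface the following two conditions of force and moment equilibrium must
be satisfied:

(10) $\dfrac{\partial(\alpha_2\mathbf{N}_1)}{\partial\xi_1} + \dfrac{\partial(\alpha_1\mathbf{N}_2)}{\partial\xi_2} + \alpha_1\alpha_2\,\mathbf{p} = 0$

(11) $\dfrac{\partial(\alpha_2\mathbf{M}_1)}{\partial\xi_1} + \dfrac{\partial(\alpha_1\mathbf{M}_2)}{\partial\xi_2} + \alpha_2\mathbf{N}_1\times\dfrac{\partial\mathbf{r}}{\partial\xi_1} + \alpha_1\mathbf{N}_2\times\dfrac{\partial\mathbf{r}}{\partial\xi_2} = 0.$
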

To obtain six scalar equations from (10) and (11), components of force and
moment resultants with respect to normal and tangential directions are
introduced as follows:

(12) $\mathbf{N}_1 = N_{11}\mathbf{t}_1 + N_{12}\mathbf{t}_2 + Q_1\mathbf{n}$
(13) $\mathbf{N}_2 = N_{21}\mathbf{t}_1 + N_{22}\mathbf{t}_2 + Q_2\mathbf{n}$
(14) M, = M,,t. + 

(15) M, = — + Most. 


¹ Eqs. (9) are obtained by writing the derivatives of the $\mathbf{t}_m$'s as combinations with
undetermined coefficients of $\mathbf{n}$ and the $\mathbf{t}_n$'s and determining the coefficients with the
help of (5) and (6).



In these expressions moments are considered positive when they produce
positive stresses on the part of the shell above the middle surface. They are
represented as vectors such that the moments act in clockwise direction if one
looks at the arrow head of the vectors. The absence of a third component in
$\mathbf{M}_1$ and $\mathbf{M}_2$ is due to the fact that the width $\alpha\,d\xi$ of the faces is an infinitesimal
while the height h is finite.


Introducing (12) to (15) into (10) and (11) we obtain 


0a.N,, , 0a,No, , 
04.0), , , @ on 
N dt, 
+a,{ No + Noe (pil + pol, yn) = 0 
0a.M,. , 04.M,, , 0a,M2, 
1% — Jt, — +- : 
ag, a, ( a, )' 
at, ét.\ at, at. 
+ J -M,,—-)+ 2, ( M..—— 
+ %2[ (Nits + + Qin) X + + + Yin) = 0. 


With (9), (5) and $\mathbf{t}_1 \times \mathbf{t}_2 = \mathbf{n}$ this becomes


Mie 21 


Rk, 


Ni2— N, = 0. 


$\mathbf{t}_1$, $\mathbf{t}_2$ and $\mathbf{n}$ being linearly independent, their coefficients in (18) and (19)
have to vanish, so that six scalar equations for the ten components of the
stress resultants are obtained. The next step, expression of the stress
resultants in terms of the stress components, will show however that only five
of the six equations are relevant, since the coefficient of $\mathbf{n}$ in (19) vanishes
identically.




To obtain stress resultants in terms of the stresses $\sigma_1$, $\sigma_2$ and $\tau_{12}$ observe,
for instance, that by definition $\alpha_2 N_{11} = \int_{-h/2}^{h/2} \sigma_1 A_2\,d\zeta$ where (2) and (7)
contain the meaning of $A_2$. Dividing through by $\alpha_2$ gives the desired
expression for $N_{11}$. In the same way follow expressions for the other resultants.
They are: 


h/2 
i J h/2 ( ae e 


heey 


( 
a 
( 


In view of assumption 2° that the stress components normal to the middle 


T12 
T12 
(23) Nee 


surface are negligibly small, i.e. $\sigma_\zeta = 0$, the stress-strain relations for
isotropic materials which involve the stress components occurring in (20) to
(23) are:

To complete the system of equations the strain components have to be
expressed in terms of the displacement components, taking into account
assumption 3° concerning the deformation of the normal to the middle surface.


5. Determination of the Strain Components. We write the displace- 
ment vector in the form 


(25) $\mathbf{U} = U_1\mathbf{t}_1 + U_2\mathbf{t}_2 + W\mathbf{n}.$


The strain components for a system of curvilinear coordinates corresponding
to a line element of the form (2) are known to be


(26) 


As 0 U. A, 0 

C a(W), A (0, DFW) Ae 


We proceed as follows to reduce (26) and (27), which give the
components occurring in the stress-strain relations, to their appropriate form.
Since the normal to the undeformed middle surface is to remain straight
we write for the displacement components

(29) $U_m = u_m + \zeta\,u'_m$, (m = 1,2)
(30) $W = W(\xi_1, \xi_2, 0) = w.$


With the help of the condition that the angle between the middle surface
and normal remains unchanged by the deformation we may express $u'_1$ and $u'_2$
in terms of $u_1$, $u_2$ and $w$. The changes of angle between the middle surface and
normal being given by the strains $\gamma_{1\zeta}$ and $\gamma_{2\zeta}$ we have

(31) $(\gamma_{m\zeta})_{\zeta=0} = 0$, (m = 1,2).

According to (7)

$A_m = \alpha_m\left(1 + \dfrac{\zeta}{R_m}\right)$, (m = 1,2)


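so that with (29) and (30), (31) is equivalent to

(32) $u'_m = \dfrac{u_m}{R_m} - \dfrac{1}{\alpha_m}\dfrac{\partial w}{\partial\xi_m}$, (m = 1,2)

and thus $u'$ is expressed by $u$ and $w$. Substituting $u'_1$ and $u'_2$ from (32),
the displacement components (29) take the following final form

(33) $U_m = u_m + \zeta\left(\dfrac{u_m}{R_m} - \dfrac{1}{\alpha_m}\dfrac{\partial w}{\partial\xi_m}\right)$, (m = 1,2).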
In this way the displacements are expressed—with little geometrical or 
analytical difficulty—in terms of the displacements of the middle surface. 
To reduce the strain components to their final form we make use of (30) 
and (33) and of assumption 1° that the thickness of the shell is small com- 
pared with the radii of curvature. 
We write 
(34) 


and replace the values of $A_m$ and their derivatives by their values on the
middle surface, i.e.

(35) $A_m = \alpha_m$, $\dfrac{\partial A_m}{\partial\zeta} = \dfrac{\alpha_m}{R_m}$, (m = 1,2).


The strain components (26) and (27) then become 




O 


w 
(37) (Us + fu’s) + 


Xo 0&5 Ao 


a fu.+t tu’, a, (“ + fu’, 
a, OE; ( )+ a» OE, 


which may be written 


y12° + 


In (39) the first terms clearly give the strains of the middle surface and the 


ice (39) =e," + = €2" + ko, Y12 
nil 
second terms the amount of bending of the middle surface. 

Introducing (36) to (38) into (20) to (23) and observing (34) there
follows, with $u'$ from (32)


2 1 0a, Ww ] OU» Uy, 
E h 2 0 &5 R, 


41) —WN., = = — — — 


0a. WwW ( 1 Ou, Uo Oa, ) 


R, 


), (43) 12(1 x Us +y (2 u's 


12(1—r?) 1 du’. 1 dw’, Oa, 
%, % 1% OE, 
The derivation of these stress-strain relations seems simpler and more direct
than those given before. This is due to the fact that one works with the general
strain components for curvilinear orthogonal coordinates and introduces into
them the assumption that the normal to the undeformed middle surface
is deformed into the normal to the deformed middle surface. Then one
introduces the assumption that the shell is "thin" which shortens the
resultant formulae greatly. This last step, which in some form is usually
introduced into the shell equations, need however not be taken.

When Eqs. (40) to (45) are introduced into the first five of Eqs. (18)
and (19) there remain five equations for the five unknowns $u_1$, $u_2$, $w$, $Q_1$, $Q_2$,
the solution of which constitutes the analytical part of the theory.


1 

(42) Noo = €2° + ve,® = — 

Gh a, a» J& \a, 




6. Concluding Remarks. Solutions of the shell equations have so far
been found for cylindrical shells and for shells of rotational symmetry under
axi-symmetrical load. No mathematical difficulty arises in the case of
circular cylindrical shells, at least for most of the interesting types of boundary
conditions. The beginnings of a theory of the general cylindrical shell are
to be found in a recent paper by A. A. Jakobsen [7]. The theory of shells
of rotational symmetry is due to H. Reissner [8] and E. Meissner [9], the
former giving the solution for spherical shells, the latter showing that H.
Reissner's method was applicable to the entire class of shells of rotational
symmetry. In addition to these results there is a solution for spherical shells
with unsymmetrical distribution of load by E. Schwerin [10]. An account
of the results of the theory of circular cylindrical shells and of shells of
rotational symmetry may be found in Love's Treatise and in books by W.
Fluegge [11], C. B. Biezeno and R. Grammel [12] and S. Timoshenko [13],
who also give further references.


MASSACHUSETTS INSTITUTE OF TECHNOLOGY. 


BIBLIOGRAPHY. 


1. A. E. H. Love, Philosophical Transactions of the Royal Society of London, vol. 179
(1888), p. 491.
2. E. Mathieu, J. École Polytechn., vol. 51 (1883).
3. H. Aron, J. Reine Ang. Math., vol. 78 (1874), p. 138.
4. F. Krauss, Mathematische Annalen, vol. 101 (1929), p. 61.
5. E. Trefftz, Z. Ang. Math. Mech., vol. 15 (1935), p. 101.
6. F. K. G. Odquist, C. R. Ac. Franc., vol. 205 (1937), p. 205, p. 271.
7. A. A. Jakobsen, D. Bauingenieur, vol. 28 (1937), p. 418, p. 436.
8. H. Reissner, Festschrift H. Mueller-Breslau, Leipzig 1912, p. 181.
9. E. Meissner, Physik. Z., vol. 14 (1913), p. 343.
10. E. Schwerin, Dissertation T. H., Berlin 1917, J. Springer, Berlin 1918.
11. W. Fluegge, Statik u. Dynamik der Schalen, J. Springer, Berlin 1934.
12. C. B. Biezeno and R. Grammel, Technische Dynamik, J. Springer, Berlin 1939.
13. S. Timoshenko, Theory of Plates and Shells, McGraw Hill, 1940.




ORTHOGONAL POLYNOMIALS DEFINED BY DIFFERENCE
EQUATIONS.*†


By OTIS E. LANCASTER.


1. Introduction: Many analogous properties of differential and
difference equations have been studied. Here these analogies are extended to include
some ideas relative to orthogonal solutions of difference equations.

Although some general theorems are given, the main study is confined to
polynomial solutions of difference equations of the form

(1) $(ax^2 + bx + c)\,\Delta_h^2 y(x) + (dx + f)\,\Delta_h y(x) + \lambda\,y(x + h) = 0$

where $h > 0$ is the interval of difference, a, b, c, d, and f are constants and $\lambda$
is a parameter which is determined so as to insure polynomial solutions.¹
After making a new definition of an adjoint equation, called the L-adjoint
difference equation, it is proved that every second order difference equation
can be put in L-self-adjoint form. Then it is shown that the solutions
corresponding to characteristic values of $\lambda$ are orthogonal (in a sense to be defined)
on some interval. Special properties of these orthogonal functions are
developed. Included among them is a difference form and a recurrence relation for
the polynomial solutions of (1). The results are shown to reduce, in the limit as
$h \to 0$, to the known facts for differential equations. The general theory is
illustrated by the polynomials analogous to the Legendre and the Hermite
Polynomials which were studied by Jordan² and Greenleaf,³ respectively, also
by polynomials analogous to the Laguerre and Jacobi Polynomials.


2. Definition of S-orthogonal functions.⁴ A sequence of functions
$\{\phi_n(x)\}$ is said to be S-orthogonal on the interval $[\mu, \nu]$
with a weight function $g(x)$, if

$S_\mu^\nu\; g(x)\,\phi_k(x)\,\phi_m(x)\,\Delta_h x = 0$, $k \ne m$,

where the operation⁵ $S$ is the inverse operation of $\Delta_h$;
that is, if

$g(x)\,\phi_k(x)\,\phi_m(x)\,\Delta_h x = \Delta_h q(x)$,

then

$S_\mu^\nu\; g(x)\,\phi_k(x)\,\phi_m(x)\,\Delta_h x = q(x)\Big|_\mu^\nu = q(\nu) - q(\mu).$

* Received April 13, 1939; Revised July 1, 1940.

† Presented to the American Mathematical Society, December, 1938.

¹ We employ the symbol $\Delta_h y(x)$ to mean $[y(x + h) - y(x)]/h$.

² Charles Jordan, "Sur une série de polynomes dont chaque somme partielle
représente la meilleure approximation d'un degré donné suivant la méthode des moindres
carrés," Proceedings of the London Mathematical Society, vol. 20 (1921), pp. 297-325;
C. Jordan, "Approximation and Graduation According to the Principle of Least Squares
by Orthogonal Polynomials," Annals of Mathematical Statistics, vol. 3 (1932), pp.
257-357.

³ H. E. H. Greenleaf, "Curve Approximation by Means of Functions Analogous to
the Hermite Polynomials," Annals of Mathematical Statistics, vol. 3 (1932), pp. 204-256.

⁴ This definition reduces to the one given by Gram if the interval $[\mu, \nu]$ is a multiple
of the interval of difference. See J. P. Gram, "Ueber die Entwickelung reeller
Functionen in Reihen mittelst der Methode der kleinsten Quadrate," Journal für Mathematik,
vol. 94 (1883), pp. 41-73.
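When the interval is a whole multiple of $h$, the definite sum above is an ordinary finite sum, and the inverse relation between $S$ and $\Delta_h$ can be checked exactly. The sketch below assumes the divided-difference meaning of $\Delta_h$ and the case $\nu = \mu + Nh$; the names `delta` and `definite_sum` and the test function `q` are illustrative choices, not the paper's.

```python
from fractions import Fraction

def delta(f, h):
    """Divided difference: x -> (f(x + h) - f(x)) / h."""
    return lambda x: (f(x + h) - f(x)) / h

def definite_sum(f, mu, steps, h):
    """S from mu to nu = mu + steps*h of f(x) Delta x, taken as h * sum of f(mu + k*h)."""
    return h * sum(f(mu + k * h) for k in range(steps))

h = Fraction(1, 4)
q = lambda x: x**3 - 2 * x      # any function whose difference is to be summed
f = delta(q, h)                 # f = Delta_h q
mu, steps = Fraction(1), 12
nu = mu + steps * h
total = definite_sum(f, mu, steps, h)   # telescopes to q(nu) - q(mu)
```

Exact rational arithmetic is used so that the telescoping can be confirmed with `==` instead of a tolerance.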


3. L-adjoint difference equations. In the study of linear homogeneous
differential equations, a necessary and sufficient condition for a function $v(x)$
to be an integrating factor of an n-th order equation $L(y) = 0$ is that $v(x)$
satisfy an n-th order differential equation $\bar{L}(v) = 0$. This differential equation,
$\bar{L}(v) = 0$, is called the adjoint equation. There exist three fundamental
relations between a differential equation and its adjoint. First, the relation
is a reciprocal relation, that is, if $\bar{L}(v) = 0$ is the adjoint of $L(y) = 0$ then
$L(y) = 0$ is the adjoint of $\bar{L}(v) = 0$. Second, Lagrange's identity

$v\,L(y) - y\,\bar{L}(v) = \dfrac{d}{dx}\{P(y, v)\}$

is satisfied, where $P(y, v)$ is linear and homogeneous in $y, y', y'', \cdots, y^{(n-1)}$ as
well as in $v, v', \cdots, v^{(n-1)}$. Third, a differential equation may be
self-adjoint, that is, $L(y) = 0$ may be identical with its adjoint.

Unfortunately, a linear homogeneous difference equation and its classical
adjoint difference equation do not possess the above mentioned properties.
The relation is not a reciprocal one, there is no exact analogue to Lagrange's
identity and no equations are self-adjoint unless the coefficients are periodic
functions of x. Since these relations are so fundamental in the study of
S-orthogonal functions, we are led to make a new definition for an adjoint
equation.

Consider a linear homogeneous difference equation of even order,

(2) $L(y) = p_0(x)\,y(x + 2nh) + p_1(x)\,y(x + (2n-1)h) + \cdots + p_n(x)\,y(x + nh) + \cdots + p_{2n}(x)\,y(x) = 0.$


⁵ This definition of the definite sum differs by a constant from the value obtained by
replacing x by $\nu$ in the principal sum defined by Nörlund. Hence the sum exists for all
functions which are summable in the Nörlund sense. N. E. Nörlund,
"Differenzenrechnung," p. 43.




Suppose there exists a function v(x) such that v(x-+ nh)L(y) is a perfect 
difference. Then by virtue of the identities 


+ mh) pon-i(x)y(x + th) = 
hal v(x + (x—h)y(x 4 1h) 

+ + 2n- 2n — th) pon- + n —ith)y(a+ nh) | 

y(x + nh) poni (x + n — n—ih)v(x 2n — ih), 
(3) 4 + nh) pon-i(x)y(x + th) 

nh) pon-i (x + n — th) v(x +- 2n — ith), 

+ nh) pon-i(x)y (a + th) = 
| — v(x + Th ) Poni + n— Ih) 
+ nh) pon-i(x)y (a + th) | 
+- y(a@ + mh) pon-i(@ + n—ih)v(a +2n—ih), i<n: 


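we obtain

(4) $v(x + nh)\,L(y) - y(x + nh)\,\bar{L}(v) = \Delta_h\{P(v, y)\}$,

where

(5) $\bar{L}(v) \equiv p_{2n}(x + nh)\,v(x + 2nh) + p_{2n-1}(x + (n-1)h)\,v(x + (2n-1)h) + \cdots + p_n(x)\,v(x + nh) + \cdots + p_0(x - nh)\,v(x)$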

and 

(6) P(v,y) =h tn 16) ) Pon-1(@ —h)y(a@+i- 


+ 2n — ih) (t+ — — th )y(x + nh) 


h > v(a + 1A) (a n—1it—Ih)- 


+--+ -+ v(a+ nh) poni(x)y(e + th). 


DEFINITION: The difference equation $\bar{L}(v) = 0$ shall be called the
L-adjoint difference equation of the even⁶ order difference equation $L(y) = 0$.


THEOREM 1. The L-adjoint relation is a reciprocal one, that is, the
L-adjoint of $\bar{L}(v) = 0$ is $L(y) = 0$.


The proof of this theorem follows immediately from the identities (3). 
The relation (4) is a direct analogue of Lagrange’s identity for differential 
equations. 


DEFINITION: When a linear homogeneous difference equation is identical 
with its L-adjoint, it is L-self-adjoint. 


⁶ The generalization of this definition so as to include odd order difference equations
has been omitted, since it has no advantages over the classical definition of an adjoint
equation.


THEOREM 2. A necessary and sufficient condition that the difference
equation of even order, $L(y) = 0$, be L-self-adjoint is
(7) $p_i(x) = p_{2n-i}(x + (n - i)h)$.
This is immediate from the definition of the L-adjoint equation.
THEOREM 3. Every second order linear homogeneous difference equation 
can be put in L-self-adjoint form. 
Proof: First, a difference equation of the form 


(8) A(w(r)Ay(er)) + +h) =0 
h h 


is L-self-adjoint. This follows from Theorem 2, for (8) is equivalent to 


wieth)y(e+ 2h) — +h) + w(r) +h) 
+ = 0. 


Second, every linear homogeneous difference equation can be put in the
form (8) by multiplying it by a certain factor. Given a difference equation⁷


(9) + + + h) =0. 
h h 
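If $t(x)$ is such a factor, then, with $q(x)$ and $r(x)$ denoting the coefficients of
$\Delta_h^2 y(x)$ and $\Delta_h y(x)$ in (9),

$t(x)\,[\,q(x)\,\Delta_h^2 y(x) + r(x)\,\Delta_h y(x)\,] = \Delta_h\bigl(w(x)\,\Delta_h y(x)\bigr) = w(x + h)\,\Delta_h^2 y(x) + \Delta_h w(x)\,\Delta_h y(x).$

Hence,

$t(x)\,q(x) = w(x + h)$, $t(x)\,r(x) = \Delta_h w(x)$,

or,

$t(x + h)\,q(x + h) - t(x)\,q(x) = h\,t(x + h)\,r(x + h).$

Therefore

(10) $t(x) = \exp\Bigl(S\,\dfrac{1}{h}\log\dfrac{q(x)}{q(x + h) - h\,r(x + h)}\,\Delta_h x\Bigr)$, $q(x + h) - h\,r(x + h) \ne 0.$

When $q(x + h) - h\,r(x + h) = 0$ the equation (9) reduces to a first order
difference equation. Q. E. D.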
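The multiplier in the proof of Theorem 3 can be tried out on a grid. The sketch below assumes the relations $t(x+h)\,[q(x+h) - h\,r(x+h)] = t(x)\,q(x)$ and $w(x+h) = t(x)\,q(x)$, with $q$ and $r$ standing for the coefficients of $\Delta_h^2 y$ and $\Delta_h y$, and confirms that $t\,(q\,\Delta_h^2 y + r\,\Delta_h y)$ then equals $\Delta_h(w\,\Delta_h y)$ for an arbitrary function $y$. The particular `q`, `r`, `y`, step `h` and starting value `t(x0) = 1` are illustrative assumptions.

```python
from fractions import Fraction

h = Fraction(1, 2)
q = lambda x: x * x + 3                  # coefficient of the second difference (assumed)
r = lambda x: 2 * x + 1                  # coefficient of the first difference (assumed)
y = lambda x: x**4 - x + Fraction(1, 3)  # arbitrary test function

def delta(f):
    """Divided difference with step h."""
    return lambda x: (f(x + h) - f(x)) / h

def multiplier(x0, steps):
    """Grid values of t from t(x+h) * (q(x+h) - h r(x+h)) = t(x) * q(x), with t(x0) = 1."""
    t = {x0: Fraction(1)}
    x = x0
    for _ in range(steps):
        t[x + h] = t[x] * q(x) / (q(x + h) - h * r(x + h))
        x += h
    return t

x0 = Fraction(0)
t = multiplier(x0, 8)
w = {x + h: t[x] * q(x) for x in t}      # w(x + h) = t(x) q(x)

def residual(x):
    """t (q D^2 y + r D y) minus D(w D y), evaluated at the grid point x."""
    dy, d2y = delta(y), delta(delta(y))
    lhs = t[x] * (q(x) * d2y(x) + r(x) * dy(x))
    rhs = (w[x + h] * dy(x + h) - w[x] * dy(x)) / h
    return lhs - rhs

residuals = [residual(x0 + k * h) for k in range(1, 8)]
```

Every entry of `residuals` should be exactly zero; the chosen `q` and `r` keep $q(x+h) - h\,r(x+h)$ away from zero on the grid, which is the non-degenerate case of the proof.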


4. S-orthogonality of Solutions of a Second Order Difference Equation.
Suppose there is an infinite sequence of distinct values of $\lambda$: $\lambda_0, \lambda_1, \lambda_2, \cdots,
\lambda_n, \cdots$ such that for each of these values the linear L-self-adjoint difference
equation

(11) $\Delta_h\bigl(w(x)\,\Delta_h y(x)\bigr) + \lambda\,s(x)\,y(x + h) = 0$

has a solution which satisfies given boundary conditions. If the solution
corresponding to $\lambda_n$ is denoted by $y_n(x)$, then

$\Delta_h\bigl(w(x)\,\Delta_h y_n(x)\bigr) + s(x)\,\lambda_n\,y_n(x + h) = 0$,
$\Delta_h\bigl(w(x)\,\Delta_h y_m(x)\bigr) + s(x)\,\lambda_m\,y_m(x + h) = 0.$

Upon multiplying the first of these by $y_m(x + h)$ and the second by $y_n(x + h)$
and utilizing the formula

$\Delta_h[u(x)\,v(x)] = v(x + h)\,\Delta_h u(x) + u(x)\,\Delta_h v(x)$

we obtain

$y_m(x + h)\,[\,w(x + h)\,\Delta_h^2 y_n(x) + \Delta_h w(x)\,\Delta_h y_n(x)\,] + \lambda_n\,s(x)\,y_n(x + h)\,y_m(x + h) = 0$,
$y_n(x + h)\,[\,w(x + h)\,\Delta_h^2 y_m(x) + \Delta_h w(x)\,\Delta_h y_m(x)\,] + \lambda_m\,s(x)\,y_n(x + h)\,y_m(x + h) = 0.$

Subtracting the second of these identities from the first and adding and
subtracting the term $w(x + h)\,\Delta_h y_n(x + h)\,\Delta_h y_m(x + h)$ to this result, we have

$\Delta_h\bigl[\,w(x)\,\{y_m(x + h)\,\Delta_h y_n(x) - y_n(x + h)\,\Delta_h y_m(x)\}\,\bigr] = (\lambda_m - \lambda_n)\,s(x)\,y_n(x + h)\,y_m(x + h).$

Hence,

(12) $w(x)\,\{y_m(x + h)\,\Delta_h y_n(x) - y_n(x + h)\,\Delta_h y_m(x)\}\Big|_\mu^\nu = (\lambda_m - \lambda_n)\;S_\mu^\nu\; s(x)\,y_n(x + h)\,y_m(x + h)\,\Delta_h x.$

Therefore, if the definite sum $S_\mu^\nu\, s(x)\,y_n(x + h)\,y_m(x + h)\,\Delta_h x$ exists, if
$w(x)$ vanishes for $x = \mu$ and $x = \nu$, and if the functions $y_n(x + h)$,
$y_m(x + h)$, $\Delta_h y_n(x)$ and $\Delta_h y_m(x)$ are finite for $x = \mu$ and $x = \nu$,
then the solutions $y_0(x + h), y_1(x + h), \cdots$ are S-orthogonal on the interval $[\mu, \nu]$ with
respect to the weight function $s(x)$. Or, the sequence of functions $\{y_n(x)\}$ are
S-orthogonal on the interval $[\mu + h, \nu + h]$ with the weight function $s(x - h)$.

⁷ Since a difference equation may be written in two forms we take the liberty to use
the form that is most convenient for the point in question.
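The step from the two multiplied equations to the perfect difference rests on an identity that holds for arbitrary functions, not only for solutions: $y_m(x+h)\,\Delta_h(w\,\Delta_h y_n) - y_n(x+h)\,\Delta_h(w\,\Delta_h y_m) = \Delta_h[\,w\,\{y_m(x+h)\,\Delta_h y_n - y_n(x+h)\,\Delta_h y_m\}\,]$. The sketch below checks it in exact arithmetic, under the divided-difference meaning of $\Delta_h$; the functions `w`, `ym`, `yn` and the step `h` are arbitrary illustrative choices.

```python
from fractions import Fraction

h = Fraction(1, 3)

def delta(f):
    """Divided difference with step h."""
    return lambda x: (f(x + h) - f(x)) / h

w = lambda x: x * x - 4          # arbitrary coefficient function
ym = lambda x: x**3 + x          # arbitrary stand-in for y_m
yn = lambda x: 2 * x**2 - 5      # arbitrary stand-in for y_n

def lhs(x):
    """y_m(x+h) D(w D y_n)(x) - y_n(x+h) D(w D y_m)(x)."""
    first = delta(lambda u: w(u) * delta(yn)(u))(x)
    second = delta(lambda u: w(u) * delta(ym)(u))(x)
    return ym(x + h) * first - yn(x + h) * second

def rhs(x):
    """D[ w { y_m(x+h) D y_n - y_n(x+h) D y_m } ](x)."""
    bracket = lambda u: w(u) * (ym(u + h) * delta(yn)(u) - yn(u + h) * delta(ym)(u))
    return delta(bracket)(x)

points = [Fraction(k, 2) for k in range(-4, 5)]
matches = [lhs(x) == rhs(x) for x in points]
```

When $y_n$ and $y_m$ do satisfy (11), the left side reduces to $(\lambda_m - \lambda_n)\,s(x)\,y_n(x+h)\,y_m(x+h)$, which is the relation summed to give (12).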


5. Polynomial Solutions. Consider the linear homogeneous difference
equation


(13) pa(w)A"y(2) + j) =0, 


where $p_i(x)$ is a polynomial of degree $\le i$, and $j$ is any constant. Define


n- Lo 

(14 6 + 

where 


THEOREM 4. A necessary and sufficient condition that (13) have a
polynomial solution ($\not\equiv 0$) is that the equation

(15) $\theta(\rho) = 0$

have a non-negative integral root. If there is a polynomial solution of degree m,
then⁸ $\theta(m) = 0$. If k is the smallest non-negative integral root, there is a
solution of degree k, and there is no solution of degree less than k.

Proof. If one substitutes $y = b_m x^m + b_{m-1}x^{m-1} + \cdots + b_0$ in equation
(13), then a necessary and sufficient condition that this be a solution is that
the coefficients of all powers of x on the left be zero. The coefficients of $x^m$,
$x^{m-1}$, $x^{m-2}$, $\cdots$ are readily seen to have the form

$b_m\,\theta(m)$, $b_{m-1}\,\theta(m - 1) + c_{m-1}$, $b_{m-2}\,\theta(m - 2) + c_{m-2}$, $\cdots$

respectively, where the c's are determined by the $a_{ij}$'s and m. Hence if there
is a solution of degree m, then $\theta(m) = 0$. Conversely, suppose there is a
non-negative integral root of $\theta(\rho) = 0$ and let k be the smallest such. Then
$\theta(k - r) \ne 0$ for $r = 1, 2, \cdots, k$ but $\theta(k) = 0$. Hence the b's can be
uniquely determined so that the above coefficients vanish. That is, there is
at least one polynomial solution and one of the solutions is of degree k. Since
$\theta(m)$ must be zero for a solution of degree m, it follows that there is no
solution of degree less than k. Q. E. D.
The same argument leads to precisely the same conclusions regarding the
corresponding differential equation (16).
From this fact follows at once the

COROLLARY 4.1. If (16) has a polynomial solution so does (13) and
conversely. Moreover the minimum degree of all polynomial solutions of (13)
is the same as that of (16). In particular, if (16) has only one polynomial
solution, then (13) has a polynomial solution of the same degree and
conversely.⁹

⁸ If $\theta(m) = 0$, it does not, however, follow that there is a solution of degree m.


Theorem 4 insures that (1) has a polynomial solution of the m-th degree,
when m is the smallest integral value for which

(17) $am(m - 1) + dm + \lambda = 0.$

So, when $d \ne -ka$, k a positive integer,¹⁰ there exists an infinite sequence of
distinct characteristic values for $\lambda$: $\lambda_0, \lambda_1, \lambda_2, \cdots$ such that (1) is satisfied by
a polynomial of m-th degree when

$\lambda = \lambda_m = -am(m - 1) - dm.$
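The constructive half of the proof of Theorem 4 can be carried out mechanically for equation (1): with $\lambda = \lambda_m$ the coefficient of $x^m$ vanishes, and the lower coefficients follow one at a time from $b_k\,\theta(k) + c_k = 0$, where for (1) $\theta(k) = ak(k-1) + dk + \lambda$. The sketch below is one way to do this with exact rationals. The list-of-coefficients representation, the function names and the sample constants are assumptions of the illustration, and the routine presumes $\theta(k) \ne 0$ for $k < m$.

```python
from fractions import Fraction
from math import comb

def shift(p, h):
    """Coefficients of p(x + h), where p[k] multiplies x**k."""
    out = [Fraction(0)] * len(p)
    for k, coeff in enumerate(p):
        for j in range(k + 1):
            out[j] += coeff * comb(k, j) * h ** (k - j)
    return out

def delta(p, h):
    """Divided difference (p(x + h) - p(x)) / h as a coefficient list."""
    return [(u - v) / h for u, v in zip(shift(p, h), p)]

def mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            out[i + j] += u * v
    return out

def apply_L(p, a, b, c, d, f, lam, h):
    """Left side of (1) applied to the polynomial p, as a coefficient list."""
    d1 = delta(p, h)
    d2 = delta(d1, h)
    terms = [mul([c, b, a], d2), mul([f, d], d1), [lam * u for u in shift(p, h)]]
    size = max(len(term) for term in terms)
    padded = [term + [Fraction(0)] * (size - len(term)) for term in terms]
    return [sum(column) for column in zip(*padded)]

def polynomial_solution(m, a, b, c, d, f, h):
    """Monic degree-m solution of (1) for lam = -a m (m-1) - d m; assumes theta(k) != 0 for k < m."""
    lam = -a * m * (m - 1) - d * m
    theta = lambda k: a * k * (k - 1) + d * k + lam
    p = [Fraction(0)] * m + [Fraction(1)]
    for k in range(m - 1, -1, -1):
        residual = apply_L(p, a, b, c, d, f, lam, h)
        p[k] = -residual[k] / theta(k)   # cancels the coefficient of x**k
    return lam, p
```

For example, `polynomial_solution(3, 1, 0, -1, 2, 0, Fraction(1, 2))` works with the sample coefficients $x^2 - 1$ and $2x$; feeding the returned polynomial back into `apply_L` should give a list of zeros.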


When (1) is written in its L-self-adjoint form (11) 


*In a way, this result is very surprising, for in general the solutions of difference 
and differential equations are quite different in nature. It is important to note precisely 
the statement of the corollary. Although there is always a polynomial solution of (13) 
that is of the same degree as a polynomial solution of (16), the two solutions are not 
the same. Moreover (16) may have more polynomial solutions than (13) and con- 
versely. For example 

toe + 3)y” + (—6r+4)y’ + 12y=0 
is satisfied by 
y 18272 + + 114, 
but 
(1 + 3)A*y (x) + (—6r + 4) Ay(x) + 12y =0 

h h 

is not satisfied by any polynomial of the fourth degree.
The corollary cannot hold if the degree of $p_i(x)$ is greater than $i$. For example:
+ 3x?) y” — (127 — 36) y = 0 
s satisfied by y r* but 
(av? + 32") (a) — — 36) y = 0 


does not have a polynomial solution. 

Moreover it is evident that the corollary could not be extended to include non- 
linear equations. For, if a non-linear algebraic difference equation of q-th degree 
ud n-th order has a polynomial solution of degree m, then the m coefficients of the 
polynomial must satisfy m2—1 relations and in general this is not possible. For 
example: 

(y’)? —4y = 0 
has a polynomial solution y = «? but 
[Ay (a) ]?—4y = 0 
does not have a solution of the form 


Y =A) + aye + 


¹⁰ If $d = -ka$ then $\lambda_n$ may equal $\lambda_m$, $n \ne m$. And if $n > m$ there may not be a
polynomial solution of degree n. For a relation equivalent to (17), see E. H.
Hildebrandt, "Systems of Polynomials Connected with the Charlier Expansions and the
Pearson Differential and Difference Equations," Annals of Mathematical Statistics, vol.
2 (1931), p. 405.




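(18) $w(x) = [\,a(x - h)^2 + b(x - h) + c\,]\,\exp\Bigl(S\,\dfrac{1}{h}\log\dfrac{a(x - h)^2 + b(x - h) + c}{ax^2 + bx + c - h(dx + f)}\,\Delta_h x\Bigr).$


Thus, if $\alpha_1$ and $\alpha_2$ are two real zeros of $w(x)$, the results of section 4 show
that the sequence of polynomials $\{y_n(x)\}$ are S-orthogonal on the interval
$[\alpha_1 + h, \alpha_2 + h]$ with the weight function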


i((r—h)? + ) 
-- bx e—h(dx +f) 


(19) Cr) exp(S log 


provided that the definite sum of this weight function on that interval has a 
meaning. We see, moreover, that if the weight function (21) is different from 
zero everywhere, the interval of S-orthogonality is [p.v], where »— 2h and 
vy — 2h are the roots of the equation 


6. Difference Form of the Polynomial Solutions of (1). If we set
$y(x) = \Delta_h^{n+1} z(x)$
the equation (1) becomes

(21) $(ax^2 + bx + c)\,\Delta_h^{n+3} z(x) + (dx + f)\,\Delta_h^{n+2} z(x) + \lambda\,\Delta_h^{n+1} z(x + h) = 0.$

And upon summing n + 1 times by means of the generalized Leibnitz theorem 
for summation, 


n 
(22) =u(a—h)A"r (x) 7 (2) 
h h h 


n(n- 1) 


A?u(2—n— 
h h 
we obtain

(23) $[\,a(x - (n+1)h)^2 + b(x - (n+1)h) + c\,]\,\Delta_h^2 z(x) + [\,dx + f + (n+1)^2 ah - (n+1)(2ax + b)\,]\,\Delta_h z(x) + [\,(n+1)(n+2)a - (n+1)d + \lambda\,]\,z(x + h) = 0.$

If

(24) $\lambda - (n+1)d + (n+1)(n+2)a = 0$

this reduces to


¹¹ Since the operators $\Delta_h^{-k}$ (k = positive integer) of (22) lack uniqueness, the right
hand side of (23) should be some function $f(x)$ whose n-th difference is 0. If however
we choose $f(x) = 0$ and use (23) to define $z(x)$, then by differencing we obtain (21),
so that $y(x) = \Delta_h^{n+1} z(x)$ is a solution of (1). Hence we may take (23) in its
homogeneous form.


h ax? +- ba +- (dx + 




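$[\,a(x - (n+1)h)^2 + b(x - (n+1)h) + c\,]\,\Delta_h^2 z(x) + [\,dx + f + (n+1)^2 ah - (n+1)(2ax + b)\,]\,\Delta_h z(x) = 0.$

And by a simple summation, we have

$\Delta_h z(x) = \exp\Bigl(-S\,\dfrac{1}{h}\log\dfrac{a(x - (n+1)h)^2 + b(x - (n+1)h) + c}{ax^2 + bx + c - h(dx + f)}\,\Delta_h x\Bigr).$

Hence,

(25) $y(x) = \Delta_h^{n}\,\exp\Bigl(-S\,\dfrac{1}{h}\log\dfrac{a(x - (n+1)h)^2 + b(x - (n+1)h) + c}{ax^2 + bx + c - h(dx + f)}\,\Delta_h x\Bigr).$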
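The passage between (21) and (23) can be checked without the Leibnitz summation theorem by going the other way: taking the $(n+1)$-th difference of the left side of (23), for an arbitrary function $z$, should reproduce the left side of (21). The sketch below does this in exact arithmetic under the divided-difference meaning of $\Delta_h$. The constants, the value of `n` and the test function `z` are arbitrary illustrative choices.

```python
from fractions import Fraction

h = Fraction(1, 3)
a, b, c, d, f, lam = (Fraction(v) for v in (2, -1, 5, 3, -4, 7))
n = 2

def delta(g):
    """Divided difference with step h."""
    return lambda x: (g(x + h) - g(x)) / h

def power(g, k):
    """k-fold divided difference of g."""
    for _ in range(k):
        g = delta(g)
    return g

z = lambda x: x**7 - 3 * x**4 + 1 / (x + 10)   # arbitrary test function

def lhs23(x):
    """Left side of (23)."""
    s = x - (n + 1) * h
    first = (a * s * s + b * s + c) * power(z, 2)(x)
    second = (d * x + f + (n + 1) ** 2 * a * h - (n + 1) * (2 * a * x + b)) * delta(z)(x)
    third = ((n + 1) * (n + 2) * a - (n + 1) * d + lam) * z(x + h)
    return first + second + third

def lhs21(x):
    """Left side of (21)."""
    return ((a * x * x + b * x + c) * power(z, n + 3)(x)
            + (d * x + f) * power(z, n + 2)(x)
            + lam * power(z, n + 1)(x + h))

differenced = power(lhs23, n + 1)   # (n+1)-th difference of the left side of (23)
agree = [differenced(Fraction(k)) == lhs21(Fraction(k)) for k in range(4)]
```

Because the check uses a function `z` that satisfies neither equation, agreement at every sample point reflects an identity between the two operators, not a property of a particular solution.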


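Now, if $v(x)$ is a solution of the L-adjoint equation of (1), then

$v(x + h)\,\bigl[\,(ax^2 + bx + c)\,\Delta_h^2 y(x) + (dx + f)\,\Delta_h y(x) + \lambda\,y(x + h)\,\bigr] = \Delta_h\bigl[\,M(x)\,\Delta_h y(x) + H(x)\,y(x)\,\bigr]$,

where

$M(x) = v(x)\,[\,a(x - h)^2 + b(x - h) + c\,]$,
$hH(x) = h\,v(x + h)\,(dx + f) + v(x)\,[\,a(x - h)^2 + b(x - h) + c\,] - v(x + h)\,(ax^2 + bx + c).$

Therefore, a first summation of the equation (1) is

$M(x)\,\Delta_h y(x) + H(x)\,y(x) = 0.$

Whence,

$y(x) = \exp\Bigl(S\,\dfrac{1}{h}\log\dfrac{v(x + h)\,(ax^2 + bx + c) - (dx + f)\,h\,v(x + h)}{v(x)\,[\,a(x - h)^2 + b(x - h) + c\,]}\,\Delta_h x\Bigr)$

or

(26) $y(x) = v(x)\,\exp\Bigl(S\,\dfrac{1}{h}\log\dfrac{ax^2 + bx + c - (dx + f)h}{a(x - h)^2 + b(x - h) + c}\,\Delta_h x\Bigr).$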


The adjoint of (1), as determined by the method of section 3, is

(27) $[\,a(x + h)^2 + (b - hd)(x + h) + c - hf\,]\,\Delta_h^2 v(x) + [\,(4a - d)x + (2b - f - dh)\,]\,\Delta_h v(x) + [\,2a - d + \lambda\,]\,v(x + h) = 0.$

This difference equation is of the form (1); hence it has a solution of the form
(25) when a condition analogous to (24) is satisfied; that is, if

$(2a - d + \lambda) - (n + 1)(4a - d) + (n + 1)(n + 2)a = 0$

or, what is equivalent,

(28) $n(n - 1)a + dn + \lambda = 0.$

Thus, when (28) is satisfied by an integer n,

$v(x) = \Delta_h^{n}\Bigl\{\exp\Bigl(-S\,\dfrac{1}{h}\log\dfrac{a(x - nh)^2 + (b - dh)(x - nh) + c - hf}{a(x - h)^2 + b(x - h) + c}\,\Delta_h x\Bigr)\Bigr\}$

and

(29) $y_n(x) = \exp\Bigl(S\,\dfrac{1}{h}\log\dfrac{ax^2 + bx + c - h(dx + f)}{a(x - h)^2 + b(x - h) + c}\,\Delta_h x\Bigr)\;\Delta_h^{n}\Bigl\{\exp\Bigl(S\,\dfrac{1}{h}\log\dfrac{a(x - h)^2 + b(x - h) + c}{a(x - nh)^2 + (b - dh)(x - nh) + c - hf}\,\Delta_h x\Bigr)\Bigr\}.$
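The adjoint (27) can be cross-checked against the shift form of the L-adjoint from section 3. Writing (1) as $p_0(x)\,y(x+2h) + p_1(x)\,y(x+h) + p_2(x)\,y(x)$, the L-adjoint for $n = 1$ is $p_2(x+h)\,v(x+2h) + p_1(x)\,v(x+h) + p_0(x-h)\,v(x)$, and this should coincide with the left side of (27) for every function $v$. The sketch below tests that in exact arithmetic, assuming the divided-difference meaning of $\Delta_h$. The constants and the test function `v` are arbitrary illustrative choices.

```python
from fractions import Fraction

h = Fraction(1, 2)
a, b, c, d, f, lam = (Fraction(v) for v in (1, 2, -3, 4, 1, 6))
q = lambda x: a * x * x + b * x + c
r = lambda x: d * x + f

def delta(g):
    """Divided difference with step h."""
    return lambda x: (g(x + h) - g(x)) / h

# Shift-form coefficients of (1): L(y) = p0 y(x+2h) + p1 y(x+h) + p2 y(x).
p0 = lambda x: q(x) / h**2
p1 = lambda x: -2 * q(x) / h**2 + r(x) / h + lam
p2 = lambda x: q(x) / h**2 - r(x) / h

def adjoint_shift(v, x):
    """L-adjoint in shift form for a second order equation (n = 1)."""
    return p2(x + h) * v(x + 2 * h) + p1(x) * v(x + h) + p0(x - h) * v(x)

def adjoint_delta(v, x):
    """Left side of (27)."""
    lead = a * (x + h) ** 2 + (b - h * d) * (x + h) + c - h * f
    middle = (4 * a - d) * x + (2 * b - f - d * h)
    return (lead * delta(delta(v))(x) + middle * delta(v)(x)
            + (2 * a - d + lam) * v(x + h))

v = lambda x: x**5 - x**2 + 1 / (x + 8)   # arbitrary test function
same = [adjoint_shift(v, Fraction(k, 3)) == adjoint_delta(v, Fraction(k, 3))
        for k in range(6)]
```

Since (27) again has the form (1), with $a$, $4a - d$ and $2a - d + \lambda$ in the roles of $a$, $d$ and $\lambda$, the condition analogous to (24) follows by direct substitution.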


The condition that $\lambda = -n(n - 1)a - nd$, where n is an integer, was
sufficient to insure that (1) has a polynomial solution. Can (29) be a
representation of these polynomial solutions?¹² We shall see that the answer is
in the affirmative.

In order to discuss the various cases which arise for the solutions (29),
it is convenient to let $\Gamma_h(x)$ denote a solution of the difference equation

$s(x + h) - x\,s(x) = 0$;

$\alpha_1$ and $\alpha_2$ denote the roots of the equation

$a(x - h)^2 + b(x - h) + c = 0$;

and $\beta_1$ and $\beta_2$ denote the roots of the equation

$ax^2 + (b - dh)x + c - hf = 0.$

Case I. $a \ne 0$. In terms of the above notation, (29) may be written as


Tn (a — a) Tr(a — as) Tr(a — Bi — nh) Tala Bz — nh) 


Yn(z) = 


or 
Ty, (2 — 2, ) 


(30) = 


x h ( B, ( B (2x — B2) 
This expression is a polynomial for any positive integer n. Hence, it is the
difference expression for the polynomial solutions of (1).

Case II. $a = 0$, $b \ne 0$, $b - dh \ne 0$.
(2 — B:) (7 — h 


[ h — B1) 


12 When h = 1, the solution (29) can be identified as the polynomials Q, (, x) of 
Hildebrandt’s paper referred to in footnote 10. The identification is made evident when 
one observes that t(#) is a solution of a difference equation of type (1) on page 421 
and that Q,(”,) is a solution of the second order difference equation XIV, on page 433. 


(31) = 


as 


ORTHOGONAL POLYNOMIALS. 195 


Case III. $a = 0$, $b = 0$, $d \ne 0$, $c \ne 0$.


a/h 
) dh 


Case IV. $a = 0$, $b \ne 0$, $b - dh = 0$, $c - hf \ne 0$.


ve n\v) = n 


In the last three cases the expressions for $y_n(x)$ are also polynomials for
integral values of n. Hence, we do have the difference form of the polynomials
of (1).¹³

7. Recurrence Relation:

THEOREM 5. The polynomials $y_n(x)$ satisfy a recurrence relation of the
form

(34) $A(n)\,y_{n+2}(x) + [\,B(n)x + C(n)\,]\,y_{n+1}(x) + D(n)\,y_n(x) = 0.$

Proof: The proof is divided into four cases corresponding to the four
forms of $y_n(x)$: (30), (31), (32), and (33).

(‘ase I. If the theorem is true for y,(«) defined by (30), then 


— B,) Ta( — Be) 


nyr+ C(n)] 


An (a h (¢ —h — Bo) 


h 


(27 — 2, ) @) 

— Bi) Bz) 
(7 — B:) (x7 — B 


h 


which, after » summations, yields (the particular result) 


A(n) A? 


h 


+ [B(n)x + C(n)] 
— a, +h) a, +h) 
Bi +h) Ta(a@— B2 + h) 


(2 — Tr(a — 
(nh) (7 — R, (nh) 
D(n) (a Bi h) (1 B: h) — B:) Ta(x — Be) 


— 2,) Ta (2 — ae) 
r 3, ) (n+ n+2h) 
[ — Bi) — Bo) 


nB(n) (nth) (7 — 


¹³ This constitutes a second proof that (1) has a polynomial solution when
$\lambda = -n(n - 1)a - nd$. The first theorem is given because it is of interest in itself.


After effecting these differences and removing the common factor 


(2 — Ta(a — 22) 


we obtain 
A(n) {(a + h — a) a) + h — — — 2(4 — (7 — 2.) 
(2 — n-+1h — B,)(« —n-+ ith —RB:) 
+ (cx—n-+ 1h—B,)(ax—n + 1h) + 2h — Bo) (47 —n + 2h 
+ B(n) {ha(a — 2,) —ha(a— n+ 1h + ih — B:) 
— h?n(.r — 2,) (a — 
+ C(n) — 2,)(«# — &) n+ 1h — Bi) («— n+ th — B.)} 
+ h?D(n) =0. 
If this relation is to be satisfied for all values of x, the coefficients of the powers
of x must vanish. The coefficients of $x^4$ and $x^3$ vanish for all values of A, B,
C, D. Equating the coefficients of $x^2$, $x^1$ and $x^0$ to zero we obtain three linear
homogeneous equations in the four unknowns. Hence there is always a
non-trivial solution.¹⁴
When A is chosen so as to avoid fractions as much as possible a solution is


(A =— (n+ d/a)(2n+d/a) 

B= (2n + d/a) (Qn + 1+ d/a)(2n + 2+ d/a) 

| 


(35) — (d/a—2)(2n+1+ d/a)(n+1-+ nd/2a+ dsayh 
« 
— 1+ d/a)(2n4+ 2+ d/a)(2n4+ 1+ nd/2a+ du 


| D= + 1) (Rn + 2 + d/a) — (2n+ d/a)* 
| ( 2a" 
| 
| 4-n(n $1) (n+ d/a)(2n + 2-4 d/e) 

a" 


+n?(n+1)(n+ d/a) (2u + 2+ d/ayl’. 


Cases II, III, and IV. By carrying through exactly the same steps as in
Case I we can show that there is always a non-trivial solution for A, B, C,
and D.

For Case II, where $y_n(x)$ is defined by (31), a solution for A, B, C, and
D is


14 The author wishes to thank Mr. W. S. Cramer for checking the evaluations for 
A, B, C, and D. 

To obtain these values for $A$, $B$, $C$, and $D$ it is convenient to obtain the three
relations by letting $x$ equal $\alpha_1$, $(n+1)h + \beta_1$, and $(n+1)h + \beta_2$.

Although (35) was obtained by assuming that (34) is true, it is clear that the steps
can be retraced, so that from (35) follows the truth of (34). The same remark applies
to the other cases.




ORTHOGONAL POLYNOMIALS. 19% 


(A= dh—b 
| 

(n+ 1) dh —nb |. 


A solution for Case III, where instead of (32) we have


dh 
Yn(v) = Bi) 
1 h 
[ l B;) B:) (= ) 


is
(37) $A = 1$, $B = 1$, $C = -(n+2)h + f/d$, $D = -(n+1)c/d$.
And a solution for Case IV, where $y_n(x)$ is defined by (33), is
\ 1 =fh—c, B=d 
(od) 


+ bn, D=(n+1)d 


8. Orthogonality and Zeros of the Polynomials.¹⁵ In this discussion,
we first consider the last three cases.


Case II. The weight function (19), which in this case is


— Bi) b— dh ; 


has poles at the points $\alpha_1$, $\alpha_1 - h$, $\alpha_1 - 2h$, $\cdots$ and zeros at $\beta_1$, $\beta_1 - h$,
$\beta_1 - 2h$, $\cdots$. Thus, if $bd < 0$, the sequence of polynomials is S-orthogonal
on the interval $[\alpha_1 + h, \infty)$. For then, $b(b - dh) > 0$, $w(x)$ vanishes at the
two points $x = \alpha_1 + h$ and $x = \infty$ and the weight function is summable over
this interval. And, if $bd > 0$, $b > dh$ and if $\alpha_1 = \beta_1 - kh$, $(k > 0)$, then the
polynomials are S-orthogonal on the interval $(-\infty, \alpha_1 + h]$.


THEOREM 6. If $bd < 0$, the zeros of the polynomials of Case II are real,
distinct and lie on the interval $[\alpha_1 + h, \infty]$. If $bd > 0$, $b > dh$ and $\alpha_1 = \beta_1 - kh$,
the zeros of the polynomials are real, distinct and lie on the interval $(-\infty,
\alpha_1 + h]$. Moreover, if $fb > cd$ and either $bd < 0$, or $bd > 0$ and $b > dh$, then
the zeros of $y_n(x)$ separate those of $y_{n-1}(x)$.

LEMMA. If $P_0, P_1, P_2, \cdots$ form a sequence of S-orthogonal


15 Note that the difference form and the recurrence relation for the polynomials hold
whether the polynomials are S-orthogonal or not, that is, even if the weight function is
not summable over the interval in question.




polynomials, and if $P_n(x)$ is of the n-th degree, then $P_n$ is S-orthogonal to any
polynomial of degree less than n.


The proof of this lemma follows immediately from the fact that any 
polynomial may be expressed as a sum of the P’s. 
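
The lemma can be checked on a small case. The following is a minimal sketch, assuming that S-orthogonality of two polynomials means that the sum of their product times the weight over the grid points of spacing h vanishes. The grid 0, 1, ..., 7, the weight 1, and the function names are illustrative choices, not taken from the paper. The sequence is built by Gram-Schmidt rather than from the difference form, and exact rational arithmetic keeps the orthogonality test free of rounding.

```python
from fractions import Fraction


def evaluate(coeffs, x):
    """Evaluate c0 + c1*x + c2*x**2 + ... by Horner's rule."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value


def s_inner(p, q, points, weight):
    """Discrete inner product: sum of p(x) * q(x) * weight(x) over the grid."""
    return sum(evaluate(p, x) * evaluate(q, x) * weight(x) for x in points)


def s_orthogonal_sequence(max_degree, points, weight):
    """Gram-Schmidt applied to 1, x, x**2, ... under the discrete inner product."""
    basis = []
    for n in range(max_degree + 1):
        p = [Fraction(0)] * n + [Fraction(1)]  # start from the monomial x**n
        for q in basis:
            c = s_inner(p, q, points, weight) / s_inner(q, q, points, weight)
            padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - c * b for a, b in zip(p, padded)]
        basis.append(p)
    return basis


# Hypothetical example: grid 0, 1, ..., 7 (h = 1) with weight 1.
grid = [Fraction(k) for k in range(8)]
unit_weight = lambda x: 1
P = s_orthogonal_sequence(4, grid, unit_weight)

# P[4] should be S-orthogonal to an arbitrary cubic, as the lemma asserts.
cubic = [Fraction(1), Fraction(2), Fraction(3), Fraction(5)]
print(s_inner(P[4], cubic, grid, unit_weight))
```

Because the cubic is a combination of P[0], ..., P[3], its sum against P[4] reduces to sums of P[4] against lower members of the sequence, which is the argument of the proof.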


Proof of the theorem. If $fb > cd$, and either $bd < 0$, or $bd > 0$ and
$b > dh$, then in the recurrence relation (34), $AD > 0$ for all $n$, hence the
polynomials form a generalized Sturmian sequence.¹⁶ Therefore, the zeros of $y_n(x)$
are real, distinct, lie on the interval of orthogonality and separate those of
$y_{n-1}(x)$.

If $fb \le cd$ and $bd < 0$, then $\beta_1 < \alpha_1 + h$ and the weight function is
positive over the entire interval $[\alpha_1 + h, \infty)$. Hence, if

$$\underset{h}{S}{}_{\alpha_1+h}^{\infty}\; y_n(x)\,g(x)\,\Delta x = 0,$$

there is at least one zero of $y_n(x)$ on the interval. Assume that there are less
than $n$; then

$$y_n(x) = (x - w_1)(x - w_2)\cdots(x - w_k)\,W(x), \qquad (k < n),$$

where $W(x)$ is a polynomial of degree $n - k$, which has no real zeros. By the
lemma

$$\underset{h}{S}{}_{\alpha_1+h}^{\infty}\; y_n(x)\,(x - w_1)(x - w_2)\cdots(x - w_k)\,g(x)\,\Delta x = 0 \qquad (n - k > 0),$$

or

$$\underset{h}{S}{}_{\alpha_1+h}^{\infty}\; (x - w_1)^2(x - w_2)^2\cdots(x - w_k)^2\,W(x)\,g(x)\,\Delta x = 0.$$

The last statement cannot hold unless $W(x)$ vanishes on the interval. This
contradicts the assumption that $W(x)$ has no real roots. Hence all roots of
$y_n(x)$ are real and lie on the interval $[\alpha_1 + h, \infty)$.

Suppose that they are not distinct, then


Let $Z(x)$ represent a product formed by taking one factor from each of the
factors of odd multiplicity. Now $Z(x)$ is of degree less than that of $y_n(x)$, so

$$\underset{h}{S}{}_{\alpha_1+h}^{\infty}\; y_n(x)\,Z(x)\,g(x)\,\Delta x = 0.$$


16 See M. B. Porter, “On the Roots of Functions Connected by a Linear Recurrent
Relation of the Second Order,” Annals of Mathematics (2nd series), vol. 3 (1901-1902),
pp. 55-70.




Again the function in the summation is of one sign, hence its sum cannot 
vanish. Thus the roots are all distinct. 

If $bd > 0$, $b > dh$, $fb \le cd$ and if $\alpha_1 = \beta_1 - kh$, then a similar argument holds
concerning the zeros of the solutions. Q. E. D.


Case III. If $cd < 0$ the weight function is real and summable over any
interval. Hence, the polynomials are S-orthogonal over the interval
$[\beta - jh, \infty)$, $(j = 0, 1, 2, \cdots)$.

Moreover, when $cd < 0$, $AD > 0$, hence the zeros of $y_n(x)$ are real, distinct,
lie on the interval $[\beta, \infty)$ and separate those of $y_{n-1}(x)$.

Case IV. If $d(fh - c) > 0$, then in the recurrent relation (34),
$AD > 0$, so the polynomials form a generalized Sturmian sequence. Therefore
the zeros of $y_n(x)$ are real, distinct, and separate those of $y_{n-1}(x)$. Since
the weight function vanishes at only one point the polynomials are not S-orthogonal.


Case I. The weight function 


— Bi) Bz) 


has zeros at
$x = \beta_m + jh$ $(m = 1, 2;\ j = 0, 1, 2, \cdots)$,
and poles at the points
$x = \alpha_m + jh$ $(m = 1, 2;\ j = 0, 1, 2, \cdots)$.
If our weight function is to be finite over the interval of summation then either 
1) a+th> a, 


or 


2) Bu (m == 1,2), 
or 


3) Xe = Bn — jh (m = 1,2; j= 0.1, -). 


In cases 1) and 2) the polynomials are S-orthogonal over an interval of length 
less than h, so are of little importance from a practical standpoint. 
If 
= Bm — jh (m = 1,237 = 0,1, 2,- - -) 


the interval of S-orthogonality may be of any length («,, may be any number. 
% <a) and the weight function is 


If, moreover, %, = 8, — ih, then the weight function is 


(39) Bn + j — 1h) (2 —B, + i— 1h), 

It follows immediately from (35) that these polynomials do not satisfy
Porter’s conditions which are sufficient to insure a generalized Sturmian
sequence, for regardless of the values of the constants $a$, $b$, $c$, $d$ and $f$ and the
magnitude of $h$, if $n$ is sufficiently large $AD < 0$. This does not prove a thing,
but it suggests that the zeros of the polynomials may not be real, distinct, etc.
Upon examining the first five polynomials of a special case of the polynomials
treated by Jordan, viz., the solutions of


+ 4r + + (24 + 3)Ay(r) —n(n 1) 


namely, 
U,= 1. U. = 3/2(z7 +2), 


we see that this is the case, for Uy has two complex roots. Many other examples 


exhibit the same properties. So, we may state that in general the roots of the
polynomials of Case I are not all real.¹⁷


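9. The evaluation of $\underset{h}{S}\,y_n^2(x)\,g(x)\,\Delta x$. Let the polynomials be
S-orthogonal on the interval $[\mu, \nu]$ with a weight function $g(x)$. Multiplying
equation (34) by $y_n(x)g(x)$ and summing from $\mu$ to $\nu$ we obtain:

$$B(n)\,\underset{h}{S}{}_{\mu}^{\nu}\; x\,y_{n+1}(x)\,y_n(x)\,g(x)\,\Delta x + D(n)\,\underset{h}{S}{}_{\mu}^{\nu}\; y_n^2(x)\,g(x)\,\Delta x = 0.$$

Upon reducing the subscripts of (34) by one, multiplying the relation by
$y_{n+1}(x)g(x)$ and summing from $\mu$ to $\nu$, we have

$$A(n-1)\,\underset{h}{S}{}_{\mu}^{\nu}\; y_{n+1}^2(x)\,g(x)\,\Delta x + B(n-1)\,\underset{h}{S}{}_{\mu}^{\nu}\; x\,y_n(x)\,y_{n+1}(x)\,g(x)\,\Delta x = 0.$$

Whence, if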


17 It should be noted that this is not contrary to the statement made by L. Fejér in
a note at the close of the first of the two papers by Jordan mentioned in footnote 1.
Fejér proved that $y_m(x)$, a solution of

$$(x - a + 2h)(x - b + 2h)\,\underset{h}{\Delta}^2 y_m(x) + [2x - a - b + 3h - m(m+1)h]\,\underset{h}{\Delta}\,y_m(x) - m(m+1)\,y_m(x) = 0$$
. 
has real, distinct zeros which lie on the interval [a,b—h] provided m < is This 


provision was not strongly emphasized in the paper of Jordan in the Annals of
Mathematical Statistics, vol. 3 (1932).




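$$\underset{h}{S}{}_{\mu}^{\nu}\; y_n^2(x)\,g(x)\,\Delta x = \frac{B(n)\,A(n-1)}{B(n-1)\,D(n)}\;\underset{h}{S}{}_{\mu}^{\nu}\; y_{n+1}^2(x)\,g(x)\,\Delta x.$$

Therefore, if $A(n)$, $B(n)$ and $D(n)$ are different from zero for all $n$ we may
evaluate

$$\underset{h}{S}{}_{\mu}^{\nu}\; y_n^2(x)\,g(x)\,\Delta x.$$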
For 
(40) S yn? (v)g(v) Ar 
h 
B(0) D(n — 1)D(n—2):--D(O) 
and

$$\underset{h}{S}{}_{\mu}^{\nu}\; y_0^2(x)\,g(x)\,\Delta x$$

may be found by direct summation.

10. Examples. To illustrate the above general theory we consider some
special cases.

a. Analogue Legendre polynomials: The polynomials studied by
Jordan satisfy the difference equation

+ | —a—b + 3h —m(m+1)h (27) —m(m + (27) = 0, 

which may be written in the form 
(r—a+ 2h) 2h n 


h 


- [22 —a—b + 3h (x) m(m + +h) = 0. 
h 
Formulas (19) and (20) show that the polynomials are S-orthogonal on the
interval $[a, b]$ with a weight function 1; formula (30) gives


Un (x) A” | (ar h) 
h 


and from the values (35) we obtain 
UO min T (2m +- 3) (22 a—b + 
(a2) = 0. 


When we take into consideration that the polynomials Q» (2) of Jordan are 


l 
multiplied by the factor —— . we obtain his recurrent relation: 




$$4(m+2)\,Q_{m+2}(x) - 2(2m+3)(2x - a - b + h)\,Q_{m+1}(x) + (m+1)\left[(b-a)^2 - (m+1)^2h^2\right]Q_m(x) = 0.$$


If b—a=nh, formula (40) yields 


(n* — 1) — 27) (n? — 37) — m?). 


b 
S Qmn2(x) Art = 
a h 4™(2m 4-1) 


b. Greenleaf’s analogue to the Hermite Polynomials: Greenleaf studied 
the polynomials satisfying the difference equation 
(p— 1) + 2(n — — 1) + = 0 
or 


(p— «x — 1) — + 1) + 2nd(a +1) = 0. 


From (18) we obtain 


w(x) = (p—«)exp(S log Ar) 
= (p— w)exp(S log (p—z—l1 )! Ar) 
(p—axr—1)! 
1 
2) 
and since 0! = 1, w(x) =0 when = p and = — p—1. 
Hence, 
p+1 1 
S nPm Ar = 0), 
(p—ax)!(p+or)! 


or 
pri 2p! 


S jo 


r= (), 


The formula (29) yields 


on(xz) = 1) 


Taking into consideration the factor (— 4)" in Greenleaf’s polynomials, we 


obtain from the values (36) the recurrent relation 
Adnio(@) — (47) + (n+ 1 ) (2p = 9. 


And (40) shows that 


1/9 n 
S : (2p) 


(p—z)! (ptr)! 2?" 9(2p)!’ 


or 
2p! 2-*p nt(2p)" 


P 
S 


slp (p—z)!(p+a)! 




c. Analogues to Laguerre Polynomials. Formulas (18), (31), (36)
show that the polynomial solutions of the difference equation
(x + 2h) A?yn (x) 
h 
h h 
+ | Ayn (x) nyn(« +h) =0, p> 1, 
h 

are S-orthogonal on the interval $[0, \infty)$ with the weight function $p^{-x}$; that
they are given by the difference coefficient
= 
h 


and that they satisfy the recurrent relation 


— r+ (n+ 2)p*+ (n+1 ) + (1 + 1)?p*yn (2) = 0. 


Also, formula (40) yields 
2 (n!)? h 
S = (n!)? ——_.. 
0 h 1 

Moreover, Theorem 6 states that the zeros of $y_n(x)$ are real, distinct, lie on the
interval $[0, \infty)$ and separate those of $y_{n-1}(x)$. The first few polynomials are


Bh(2p*+1) h?(11p% + Sp" + 2) 
p'—1 + (p*— 1)? r+ (p*—1)* 


When $p = e$ the above polynomials are analogous to the Laguerre Polynomials.
In fact, when $y_n(x)$ is multiplied by $1/n!$, if we take the limit as $h \to 0$, the
above results all reduce to the corresponding relations for the Laguerre
Polynomials.
To obtain a simple illustration of these polynomials. let p=1, h= 1. 
then 
+ 2) — (x) + + 1) =0 
Yn (2) == 2-7) 

— [— a+ 8n + 4] (x) + (n (7) = 0. 


The first few polynomials are 


2 9 h h 9 

h p" p! 

1 3 




Po = 1, P, =—3(x—1). P, = f(a? — 5x + 2), 
= — }(a* — 1227 + 292 — 6), 
P, = (rt — 222° + 13122 — 206r + 24),-- - 


d. Analogue to Jacobi polynomials. It follows from the general Case I
treated above, that the polynomial solutions of the difference equation
(1—ax— 2h) (1+ 2h) (2) 
h 
+ — (p+ 9+2)2 + (9p —p—q— 3) dgn(z) 
=0: (p> 
are S-orthogonal on the interval $[-1, 1]$ with the weight function
(1—ar)™., 
They are given by the difference coefficient
h 
and they satisfy the recurrence relation 


Ppt 2) (n+ pt gt 
+ 
+n/2(2p+2¢+1)} 44) Gp+q4+1 
—n/2(p+q—3)} (2) 


+ 1) pt +4) 1)? + (n+ 1) (p+ 9) +793 
+ (pt 1) (qt 
+ pry +2 pt1) (q+ = 0. 


11. Limit as $h \to 0$. It is natural to ask, “what happens to the above
definitions and properties of the difference equations as $h \to 0$?” We shall
show that they reduce to the analogous definitions and properties that have
been developed for differential equations.

a. The definition of S-orthogonal functions reduces in the limit, as
$h \to 0$, to the integral definition of orthogonal functions, for the sum and
the difference quotient reduce to the integral and the derivative respectively.
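
A small numerical sketch of this limit follows. It assumes that the difference quotient is the forward quotient (f(x + h) - f(x))/h and that the sum carries a factor h over the grid points a, a + h, ... below b. Both conventions and the test function x**3 are assumptions made for illustration, since the paper's exact definitions lie outside this section.

```python
def difference_quotient(f, x, h):
    """Forward difference quotient of f at x with step h."""
    return (f(x + h) - f(x)) / h


def s_sum(f, a, b, h):
    """h times the sum of f over the grid a, a + h, ..., stopping short of b."""
    n = round((b - a) / h)
    return h * sum(f(a + k * h) for k in range(n))


cube = lambda x: x ** 3

# As h shrinks, the quotient at x = 2 approaches the derivative 3 * 2**2 = 12
# and the sum over [0, 1) approaches the integral 1/4.
for h in (0.1, 0.01, 0.001):
    print(h, difference_quotient(cube, 2.0, h), s_sum(cube, 0.0, 1.0, h))
```

For this test function both errors are of order h, so each tenfold reduction of h gains roughly one decimal place.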

b. The definition of the L-adjoint difference equation reduces to the
definition of the adjoint differential equation. For the difference equation
(2) may be written as


(2) + (a) = 0. 
h h 




where 




J (2n —j) (Qn —7—1)-- (2Qn—/1+4+1) 
xv) = Px 
qu(x) j(@) (l1—j)! 
(7 = 0,1,2,- +, 2n) 
and the Z-adjoint equation (5) may be written as 
2n 2n-1 
A(qo(a — nh)v(x)) lu(a +h) 
h 
+ Aly: (x —n— 2h) v(x + 2h) ] 
(—1)? + nh)v(a + 2nh) =0; 


and each of these reduces in the limit, when the limits of the $q_i$ exist, to a
differential equation, the second being the adjoint of the first.
The difference equations (1) and (9) approach the differential equations


(41) $(ax^2 + bx + c)\,y''(x) + (dx + f)\,y'(x) + \lambda\,y(x) = 0$
and
(42) + (7) + w(r)y(r) = 0, 


respectively.
c. The limit as $h \to 0$ of the multiplier (10) is the factor which makes
(42) self-adjoint.


Proof. 
q(x) \ 


log = log q(. t+ h) r(a+ h) 
== lin exp (s log (: h) 
(s! +h) 
q ( t- ~h) 
r(x +h) | , | 
1 (f ) 
= —— exp dx}. 
J 


Routh¹⁸ has studied in some detail the polynomial solutions of the differential
equation (41). He shows that in general, the polynomial solutions,


= lim exp(— log q(a)) - exp 


one for each characteristic value of $\lambda$, possess five main properties:


18 E. J. Routh, “On some properties of certain solutions of a differential equation
of the second order,” Proceedings of the London Mathematical Society, vol. 16 (1884),
pp. 245-261. See also, W. C. Brenke, “On polynomial solutions of a class of linear
differential equations of the second order,” Bulletin of the American Mathematical
Society (1930), pp. 77-84. Beale, “On the polynomials related to the differential
equation $\dfrac{1}{y}\dfrac{dy}{dx} = \dfrac{a_0 + a_1 x}{b_0 + b_1 x + b_2 x^2}$,”
Annals of Mathematical Statistics, vol. 8 (1937), pp. 206-23.




1) They may be expressed as a differential coefficient. 

2) They satisfy a recurrent relation. 

3) They form an orthogonal sequence. 

4) Their zeros are all real and confined within certain limits. 


5) They possess a generating function. 


In the first nine sections of this paper we have shown that the polynomials
defined by equation (1) possess three, and in some cases four, of the properties.
We have not been able to show the existence of a generating function. We now
show that the polynomial solutions of (1) and their properties reduce in the
limit, as $h \to 0$, to the polynomial solutions of (41) and their properties that
were discovered by Routh.


1) The limit, as $h \to 0$, of the difference form of the polynomial is
equal to


li az’? + br +c¢—h(dr +f) 
a(r— nh )? + (b — dh) (a —nh) +c—hf 


1 
“N [ exp ( h log a(a—h)? h ) 


lr +f 
= (ax? + br +c) exp ( f ir) 
J 
exp log —h)? + b(a +e 
L 


d(a—nh) +f 
-exp ( S—log (1 — r 
( h log (3 a(a—nh)* + b(a—nh) + -) 
dr f 
= (ax? + br + c) ex} ( - iz) 
e ar” bx 


jn 
lim exp h )2 b(z— h) + c|* 1h) 


da hoo 


—f dx + f | 

lx +f 
= (ar? br c)exp ( iz) 
J ar+br--e 

dz + f 

[ ww? + br +c) exp (f an + da ) 


which is Routh’s second differential form for the polynomial solutions of (41). 


2) The values of A, B, C, and D in the recurrent relation (34) given 
by the expression (35), (36), (37), and (38) reduce, respectively, to the four 


sets of values 




A =— (n+ d/a)(2n+ d/a) 
B= (2n+ d/a)(2n+1+ d/a)(2n+ 2+ d/a) 


= — (d/2 — 2)(2n+1+d/a) + (b/2a)B 

bd — 2af\2 
A=b, B=d, C=2(n+1)b+f, D= ( |; 
$A = 1$, $B = 1$, $C = f/d$, $D = -(n+1)c/d$;

and

$A = -c$, $B = d$, $C = f$, $D = (n+1)d$.


The first three sets of values are the values obtained by Routh for the three 


cases he studied. 


3). The weight function ¢(2—h) of the S-orthogonal functions of (1), 


as shown above, reduce to 


ax? +- bx + ¢ SPS ax? +be+c° 


The interval of S-orthogonality is given by two zeros of 


— + 2h) + c| 


eXD L log 2h)? b(x 2h) 


and this approaches the function 


( f dx +f ) 
ax? + be + 


and its zeros give the interval of orthogonality for the polynomials of Routh. 


4). In the cases where zero theorems hold, the results reduce to the
known results, as $h \to 0$. In other cases, the number of real zeros of the
polynomials that lie on the interval of S-orthogonality increases as $h$ decreases.

In particular, the examples (a), (c) and (d), have for their limit their
corresponding analogues.


UNIVERSITY OF MARYLAND. 




ON AN EXTREMUM PROBLEM IN THE PLANE.* 
By GY. SZEKERES. 


The following problem was proposed by L. M. Blumenthal¹: Let $M$ be a
set of $n$ arbitrary points in the plane; what is the minimum (as $M$ varies)
of the greatest of the $3\binom{n}{3}$ angles formed by the points of $M$.

If $n = 3, 4, 5, 6$, this minimum is, as stated by Blumenthal, $\left(1 - \frac{2}{n}\right)\pi$,
attained in the case of the regular polygon. But P. Erdős has shown that
there exist configurations of 7 points with every angle $< \left(1 - \frac{1}{3}\right)\pi + \epsilon$ with
$\epsilon$ arbitrarily small.
In the first part of this paper we show that for $n \ge 2$ there exist configurations
of $2^n$ points in the plane such that all the angles formed by them are
$< \left(1 - \frac{1}{n}\right)\pi + \epsilon$ with $\epsilon$ arbitrarily small (Theorem 1).

In the second part we show that among the angles formed by $2^n + 1$
points of the plane there is at least one which is $> \pi\left(1 - \frac{1}{n} + \frac{1}{n(2^n+1)^2}\right)$,
which shows that our results are, in a sense, best possible.


I. Let A > 2 be a sufficiently large number. In the plane of the complex 


numbers we define the set of points 1’5, P?;, Py, N = 2" —1, 
ie === () 
(1) \ further if & == 2% 4+ 2% + ---4+2, 


We prove that every angle formed by any three of these points is 
1 
< € —~—]r+ewheree>0 if Low. Leth then 
n 
where 


and pj is the greatest exponent occurring in only one of the representations 
(1) of & and k’. Hence 


* Received May 6, 1940; revised October 9, 1940. 
1L. M. Blumenthal, “ Metric methods in determinant theory,” American Journal of 
Mathematics, vol. 61 (1939), pp. 912-922. 




Py Px 


= wt 
where 
Thus the direction of P;,— Px is arbitrarily near to + e/")™ if only A is 
sufficiently large. We shall call 8,,e/""' the approximate direction of 
Py — Px; in symbols 
D(k’ —k) = 
Now let k’ ~ and 
D(k’ —k) = 
(2) D(k” —k) = 
—k’) = 


Then p’~p» and =; for, if the representation (1) of & contains the 
exponent p, then that of k’ and k” cannot contain it; on the other hand, if p is 
not contained in the representation (1) of &, the k’ and k” must both contain 
it. In either case we have p’ ~ p. 

Now suppose 6 = — 8’; then the vectors Py — Py, and Px — Px are in 
(approximately) opposite direction, and we would have D(k’ —k’) = 
D(k’’ —k) against our former assertion. Hence 6=®& in (2). 

But obviously the approximate angle formed by any two $P_{k'} - P_k$ and
$P_{k''} - P_k$ is $(1 - j/n)\pi$, $j = 0, 1, \cdots, n$ and here the case $j = 0$ is excluded by
the above considerations. This completes the proof of Theorem 1. To prove
the second half of our theorem we put forward some considerations in the
theory of graphs. We call a system of points $Q_1, Q_2, \cdots$ and edges $Q_iQ_j$ connecting
these points a graph. We call a system of edges $Q_1Q_2, Q_2Q_3, \cdots, Q_{l-1}Q_l$
a path. If $Q_1 = Q_l$ we have a closed path. If every two of the points $Q_i$ and
$Q_j$ are connected the graph is called complete. The proof of Theorem 2 is
based on the following


Lemma: Let $G$ be a complete graph of $N > 2^n$ points; then it cannot be
the union of $n$ graphs $g_1, g_2, \cdots, g_n$ such that in every $g_i$ all closed paths contain
an even number of edges.

Proof. We use complete induction. The Lemma evidently holds for
$n = 1$. Assume that it is true for $n - 1$ and that

$$G = g_1 + g_2 + \cdots + g_n$$

where in each $g_i$ all the closed paths have an even number of vertices. It is
well known² that we can divide the vertices of $g_1$ into two classes $A$ and $B$ such


2 Dénes König, Theorie der endlichen und unendlichen Graphen, p. 170.




that every edge of $g_1$ connects a point of $A$ with a point of $B$. Hence we can
also divide the vertices of $G$ into two disjoint classes $A'$ and $B'$ such that every
edge of $g_1$ connects a point of $A'$ with a point of $B'$. We can assume that the
number of points of $A'$ is not less than $\frac{N}{2} > 2^{n-1}$. But then

$$g_2 + \cdots + g_n$$

would contain a complete graph (consisting of all joins of pairs of points of
$A'$) the number of points of which is greater than $2^{n-1}$, which contradicts the
induction hypothesis; this completes the proof.
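
The Lemma can be tested by brute force on very small graphs. The sketch below uses the classical fact, cited in the proof, that a graph has only even closed paths exactly when its vertices split into two classes with every edge joining the classes. The function names are illustrative. The "highest differing binary digit" splitting of the complete graph on 2^n points is offered only as a demonstration that the bound 2^n cannot be lowered; it is an assumption of this sketch, not a transcription of the construction in part I.

```python
from itertools import combinations, product


def is_bipartite(vertices, edges):
    """Two-colour the graph; this fails exactly when some closed path is odd."""
    adjacent = {v: [] for v in vertices}
    for u, v in edges:
        adjacent[u].append(v)
        adjacent[v].append(u)
    colour = {}
    for start in vertices:
        if start in colour:
            continue
        colour[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adjacent[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    stack.append(v)
                elif colour[v] == colour[u]:
                    return False
    return True


def bit_cover(n):
    """Split the complete graph on 2**n points into n graphs by highest differing bit."""
    parts = [[] for _ in range(n)]
    for u, v in combinations(range(2 ** n), 2):
        parts[(u ^ v).bit_length() - 1].append((u, v))
    return parts


def can_cover(num_points, n):
    """Exhaustive search: is the complete graph a union of n graphs with even closed paths?"""
    vertices = range(num_points)
    edges = list(combinations(vertices, 2))
    for labels in product(range(n), repeat=len(edges)):
        parts = [[e for e, lab in zip(edges, labels) if lab == i] for i in range(n)]
        if all(is_bipartite(vertices, part) for part in parts):
            return True
    return False


print(all(is_bipartite(range(8), part) for part in bit_cover(3)))
print(can_cover(4, 2), can_cover(5, 2))
```

Assigning each edge to exactly one graph loses no generality, since dropping edges from a graph cannot create an odd closed path. The search is exponential in the number of edges and is meant only for a handful of points.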

Now we prove Theorem 2. Let $N > 2^n$ points $P_1, \cdots, P_N$ be given in
the plane. Connect any two of them; there evidently exists a line $O$ such that
all the angles formed by $O$ and any of the lines $P_iP_j$ are numerically greater
than = Denote by K,“ the set of oriented lines in the plane whose angle 
with O taken positively lies between 


N? /N 
/ and 


at (r—1) +r (inclusive), r= 1,2,°--n. 


and similarly by K,‘) the set of lines whose angle with O lies between 


+53) + (r and tpt 


(inclusive). 


Denote now by g; the graph formed by the lines P;P; directed from Pj; to P; 

(K-™ + K,@)), Evidently the union of 9,,92,° gn contains the com- 
plete graph formed by our N points; thus by our Lemma at least one of the 
graphs g say gi contains a closed path with an odd number of vertices P,, 
+ + Pors;. But then there exist three consecutive vertices, say Pi, Pi, 


such that both lines P;_,P; and lie either in K,™) or in K,®). 


In either case the angle P;.P:Pi., is not less than -(1 —t4+—5 i) which 


completes the proof. 

We can ask similarly the minimum of the greatest of the angles formed
by $N$ points in three dimensional space. Our method does not give the exact
answer in this case. We can only prove that, on the one hand among $2^n + 1$
points there are always three which form an angle $> \pi\left(1 - \frac{c_1}{n}\right)$; on the
other hand there exist $2^n$ points such that the maximum angle is
$< \pi\left(1 - \frac{c_2}{n}\right)$, where $c_1$ and $c_2$ are constants.


59 BRENAN Roap, SHANGHAI. 


EXPLICIT BOUNDS FOR SOME FUNCTIONS OF PRIME NUMBERS.* 


By BarkLey Rosser. 


Summary of results. Counting 2 as the first prime, we denote by $\pi(x)$,
$p(n)$, and $\theta(x)$, respectively, the number of primes less than or equal to $x$,
the n-th prime, and the logarithm of the product of all primes less than or
equal to $x$. It is known that for each positive constant $A$, there is a constant
$N$ for which the three following statements hold true.

If $N \le x$, then
$$\frac{x}{\log x - 1 + A} < \pi(x) < \frac{x}{\log x - 1 - A}.$$

If $N \le n$, then
$$n\log n + n\log\log n - n - An < p(n) < n\log n + n\log\log n - n + An.$$

If $N \le x$, then
$$\left(1 - \frac{A}{\log x}\right)x < \theta(x) < \left(1 + \frac{A}{\log x}\right)x.$$


Moreover, these three statements are essentially interdeducible, in the
sense that if one of them can be proved for a certain value, $A^*$, of $A$, then
the other two can be proved with the value $A^* + \epsilon$ for $A$, where $\epsilon$ depends on $N$.

Heretofore the question of determining the $N$ which goes with a particular
$A$ has received no attention. Moreover, the question of how small $A$ can be
taken without requiring that $N$ become large has been neglected.

Theorem 22 of this paper furnishes an explicit answer (not the best
possible) to the first question. For the second question the answer $A = 3$
is given. Again this is not the best possible, and a partial proof is given
that we can take $A = 1$. In particular, for $A = 3$, we have:
THEOREM 29. If $55 \le x$, then:

A. $\dfrac{x}{\log x + 2} < \pi(x) < \dfrac{x}{\log x - 4}$.

B. $x\log x + x\log\log x - 4x < p([x]) < x\log x + x\log\log x + 2x$.


~ 
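
Parts A and B of Theorem 29 can be spot-checked at integer arguments. The following is a rough numerical sketch, not a proof: it tests only integers up to a chosen limit, uses floating-point logarithms, and the function names and the limit 100,000 are illustrative choices. Part B is tested only for those $n$ whose $n$-th prime lies within the sieve.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def theorem_29_failures(limit):
    """List the integers 55 <= x <= limit at which inequality A or B fails."""
    primes = primes_up_to(limit)
    failures = []
    pi_x = 0  # running count of primes <= x
    for x in range(2, limit + 1):
        if pi_x < len(primes) and primes[pi_x] == x:
            pi_x += 1
        if x < 55:
            continue
        log_x = math.log(x)
        if not (x / (log_x + 2) < pi_x < x / (log_x - 4)):
            failures.append(("A", x))
        if x <= len(primes):  # the x-th prime is available from the sieve
            main = x * log_x + x * math.log(log_x)
            if not (main - 4 * x < primes[x - 1] < main + 2 * x):
                failures.append(("B", x))
    return failures


print(theorem_29_failures(100_000))
```

The starting point 55 is the first integer beyond $e^4$, where the denominator $\log x - 4$ of the upper bound in A becomes positive.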


* Received June 28, 1940. 




For $A = 1$, we have the partial result embodied in the following seven
theorems.


THEOREM 30. If =z, then: 


log x r—2° 


B. zrlogz+ zlog log r— 2x < p([r]) < clog logz. 


1 é 1 
C. < (1 


THEOREM 26. Jf 17S then < (zx). 
log x 
THEOREM 25. If e << then r(x) < 
og — 2 


THEOREM 27. If 1<nSe'”, then nlogn-+n log log n — 2n < p(n). 


THtorEeM 28. If 6S nS then p(n) < nlogn + n log log n. 


THEOREM 23. If 41S then (: — )e < 


log x 


THEOREM 24. If 1< then < (: 
og 


For $0 < x \le 1{,}000{,}000$, these results were derived from the very sharp
theorem:
If $0 < x \le 1{,}000{,}000$, then $x - 2.78\,x^{1/2} < \theta(x) < x$;


which was proved essentially by comparison with Lehmer’s “ List of Prime 
Numbers.” For 1,000,000 = 2x, analytical methods were used. It is re- 
markable that in the ranges 1,000,000 = e® and = the analytical 
methods enable one to take A =—1, whereas in the intermediary range 


e® = x = e*° one has to be satisfied with a larger value of A. This peculiar 


situation is apparently due to the insufficiency of our present knowledge about
the zeros of the Riemann zeta function. A significant increase in our present
knowledge would undoubtedly enable one to take $A = 1$ in the intermediary
range also. For instance, if $\beta + i\gamma$ is a typical complex zero of the zeta
function, it would suffice to know that


6 log y 


for $1400 \le \gamma$.




References to the bibliography will consist of a number referring to the 
numbered bibliography at the end of the paper followed by the necessary 
page references. 

Tables of $\psi(x)$, $\theta(x)$, and $p(n)$. We define

$$\psi(x) = \sum_{p^m \le x} \log p.$$

The following relations connect $\theta(x)$ and $\psi(x)$:


[x] 
= = 


(2) = (2) — — ye) + 


Gram has given a table of $\psi(x)$ to eight decimal places for $1 \le x \le 2000$
(see 1, pp. 281-288). As a check on Gram’s table, the present author computed
$\psi(x)$ for $1 \le x \le 2000$ by use of a six place table of natural logarithms
and found no discrepancies outside the limits of accuracy of the computation.
By comparing the first few entries of Gram’s table with values computed from
eight place, nine place, ten place, and eleven place values of the natural
logarithms of the primes, it was apparent that Gram must have used eight
place values in computing his table. Hence the eighth place in Gram’s table
is not reliable.

By use of (2) and Gram’s table, one can readily compute $\theta(x)$ for
$1 \le x \le 2000$. In order to facilitate the computation of $\theta(x)$ for $x \le 10{,}000$,
Table I was prepared (see end of paper). Table I is auxiliary to a table of
Jones (2, pp. 114-117. Note two errors: log 6899 = 8.839132 and log 7853
= 8.968651). Jones’s table gives the natural logarithms to six decimal
places of the primes from 2000 to 10,000. To compute $\theta(x)$, one can take
the nearest entry below $x$ in Table I, and add the logarithms of the intervening
primes as given in Jones’s table. As a check, one can take the next
entry above $x$, and subtract the logarithms of the intervening primes. So that
this check will come out exact, Table I was computed from Jones’s table, and
all six places were retained, though the last is quite unreliable. Table I was
checked by adding up seven place common logarithms of the primes, and
multiplying the sums by $\log_e 10$.

Lehmer’s List of Prime Numbers (3, pp. 1-135) is a tabulation of
$p(n-1)$ as a function of $n$ for $1 \le n \le 675{,}000$ (Lehmer takes 1 as the
first prime). From this table, one can read off $\pi(x) + 1$ for $1 \le x \le 10{,}006{,}721$.

Relations between $\pi(x)$, $p(n)$, and $\theta(x)$. Obvious relations are




(3) $\pi(p(n)) = n$

(4) $p(\pi(x)) \le x$

(5) $\theta(p(n)) = \sum_{r=1}^{n} \log p(r)$.


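By use of a theorem connecting sums with integrals (4, Theorem 4,
p. 18), we get


(6) $\theta(x) = \pi(x)\log x - \displaystyle\int_2^x \frac{\pi(y)}{y}\,dy$
(7) $\pi(x) = \dfrac{\theta(x)}{\log x} + \displaystyle\int_2^x \frac{\theta(y)}{y\log^2 y}\,dy$.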


From these we now proceed to deduce relations which hold between
functions which bound $\pi(x)$, $p(n)$, and $\theta(x)$. In constructing bounding
functions for $\pi(x)$, the logarithmic integral,

$$\operatorname{li}(x) = \lim_{\eta \to 0}\left(\int_0^{1-\eta} \frac{dy}{\log y} + \int_{1+\eta}^{x} \frac{dy}{\log y}\right),$$

is very convenient to use, so we call attention to a few tables of $\operatorname{li}(x)$ and the
related function

$$\operatorname{Ei}(x) = \operatorname{li}(e^x).$$

The most complete table of $\operatorname{Ei}(x)$ is Table VII in Vol. 1 of the Mathematical
Tables of the British Association for the Advancement of Science. A short
table of $\operatorname{Ei}(x)$ is given in Chapter VI of Funktionentafeln, by Jahnke and
Emde. Lehmer gives a table of $\operatorname{li}(x)$ correct to the nearest integer (3, pp.
xiii-xvi. The column headed “Tchebycheff” contains values of $\operatorname{li}(x)$, and
not values of $\displaystyle\int_2^x \frac{dy}{\log y}$ as Lehmer claims).
2 logy 


Proof. The expression on the right has the value 2.008, whereas, by (7),
the expression on the left has the value 1.773.


A similar proof using (6) holds for the next two lemmas. 


e 
LEMMA 2. > 4hli i(e* imax 902 4li(e?). 


LeMMA 3. < 3li(e*) — 



Lemma 4, If < @ for e4* K, then x(x) < li(x) for 
K. 

Proof. Under the hypothesis of the theorem we can deduce from (7) 


and Lemma 1 that, if e?*=2= K, then 


However, integration by parts gives 


f ydy _dy 
log y log y’ 


from which the lemma follows. 


For the next two lemmas, the proof is similar and makes use of the fact 


that integration by parts gives 


li(z)dz 
yo 2° 


LemMA 5. Jf li(x) —li(a?) < x(x) for aK, then 


and that if y = 2° 


O(x) < r— + log (x(x) —li(x) + li(z4)) 


LemMa 6. Jf x(x) <li(x) for PBS a= K, then 


«— log x(li(x) < O(a) 


for K. 


Corottary. Jf —li(a?) << <li(x) for Kk, then 
t—logali(x?) << 


LEMMA 7. For $e^4 \le x$, $\operatorname{li}(x) < \dfrac{x}{\log x - 2}$.

Proof. For $x = e^4$, the inequality is true, and for $e^4 \le x$,

$$\frac{d}{dx}\operatorname{li}(x) \le \frac{d}{dx}\,\frac{x}{\log x - 2}.$$
LemMA 8 For < li(x) —li(a*). 


log x 


Similar proof. 




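LEMMA 9. For $n \ge 5$, $\theta(p(n)) > n\log n + n\log\log n - n - \operatorname{li}(n)$.
Proof. Since $p(r) > r\log r$ (5, Theorem 1, p. 37), we have by (5),

$$\begin{aligned}
\theta(p(n)) &\ge \theta(p(5)) + \sum_{r=6}^{n} \log(r\log r)\\
&\ge \theta(p(5)) + \int_5^n \log x\,dx + \int_5^n \log\log x\,dx\\
&= \theta(p(5)) + n\log n - n - 5\log 5 + 5\\
&\qquad + n\log\log n - 5\log\log 5 + \operatorname{li}(5) - \operatorname{li}(n)\\
&> n\log n + n\log\log n - n - \operatorname{li}(n).
\end{aligned}$$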
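
As a numerical illustration of Lemma 9, the inequality can be checked for the primes found by a sieve. This is a sketch under stated assumptions: the logarithmic integral is computed from the standard series $\gamma + \log\log x + \sum_{k \ge 1} (\log x)^k/(k \cdot k!)$, valid for $x > 1$, rather than read from the tables mentioned in the text, and the function names and the sieve limit are illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329


def li(x):
    """Logarithmic integral for x > 1 from its series in powers of log x."""
    t = math.log(x)
    total = EULER_GAMMA + math.log(t)
    term = 1.0
    for k in range(1, 120):
        term *= t / k        # term is now t**k / k!
        total += term / k
    return total


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def lemma_9_failures(limit):
    """Indices n >= 5 with p(n) <= limit at which the bound of Lemma 9 fails."""
    failures = []
    theta = 0.0  # running value of theta(p(n))
    for n, p in enumerate(primes_up_to(limit), start=1):
        theta += math.log(p)
        if n >= 5:
            bound = n * math.log(n) + n * math.log(math.log(n)) - n - li(n)
            if not theta > bound:
                failures.append(n)
    return failures


print(lemma_9_failures(50_000))
```

The running sum of logarithms mirrors relation (5), so each new prime costs one logarithm and one evaluation of the series.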
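LEMMA 10. If $16 \le n$ and if $p(r) < r\log r + r\log\log r$ for $16 \le r < n$,
then

$$\theta(p(n)) < n\log n + n\log\log n - n + \frac{n\log\log n}{\log n}.$$

Proof. Since $p(n) < n\log n + 2n\log\log n$ for $n > 3$ (5, Theorem 2,
p. 40), we have

$$\begin{aligned}
\theta(p(n)) &< \theta(p(15)) + \sum_{r=16}^{n-1} \log(r\log r + r\log\log r) + \log(n\log n + 2n\log\log n)\\
&< \theta(p(15)) + \int_{16}^{n} \log x\,dx + \int_{16}^{n} \log(\log x + \log\log x)\,dx\\
&\qquad + \log(n\log n + 2n\log\log n)\\
&= \theta(p(15)) + n\log n - n - 16\log 16 + 16 + n\log(\log n + \log\log n)\\
&\qquad - 16\log(\log 16 + \log\log 16) - \int_{16}^{n} \frac{(\log x + 1)\,dx}{\log x\,(\log x + \log\log x)}\\
&\qquad + \log(n\log n + 2n\log\log n).
\end{aligned}$$

However

$$\log(\log n + \log\log n) = \log\log n + \log\left(1 + \frac{\log\log n}{\log n}\right) < \log\log n + \frac{\log\log n}{\log n}$$

and

$$\int_{16}^{n} \frac{(\log x + 1)\,dx}{\log x\,(\log x + \log\log x)} > \int_{16}^{n} \frac{dx}{2\log x},$$

and

$$\theta(p(15)) - 16\log 16 + 16 - 16\log(\log 16 + \log\log 16) - \tfrac{1}{2}\bigl(\operatorname{li}(n) - \operatorname{li}(16)\bigr) + \log(n\log n + 2n\log\log n) < 0.$$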


The range $0 < x \le 1{,}000{,}000$.
THeroreM 1. For 1,000,000, li(x) —li(r4) < x(z). 


Proof. The interval $0 < x \le 1{,}000{,}000$ was divided into convenient subintervals.
Corresponding to each subinterval, $I_r$, a linear function, $Ax + B$,
was determined so that


$$\operatorname{li}(x) - \operatorname{li}(x^{1/2}) < Ax + B$$


for $x$ in $I_r$. Then by comparison with Lehmer’s List of Prime Numbers,
it was determined that $Ax + B + 1 < \pi(x) + 1$ for $x$ in $I_r$. To show how
this can be done quickly, we exhibit a specimen of the computation. By
getting the equation of the tangent to the curve


$$y = \operatorname{li}(x) - \operatorname{li}(x^{1/2})$$
at $x = e^{13.4}$, and by noting that the curve is convex upward, we ascertain that


4,323. 
< 4,823.1 + 13.41651 


for 2 > 2. So we undertake to show that 


in $I_r$. Put $x = e^{13.4} + 13.41651\,y = a + by$. Then it suffices to show that


(8) $53{,}518 + y < \pi(a + by) + 1$


for $a + by$ in $I_r$. Now, using a Monroe High Speed Adding Calculator
(or any computing machine of similar construction), put 53,518 on the upper
dials, put $a$ (that is, 660,003.22477) on the lower dials, and $b$ (that is,
13.41651) on the keyboard. Now if one holds the + bar down for $y$ revolutions,
$53{,}518 + y$ will appear on the upper dials and $a + by$ will appear on
the lower dials. In other words, we have now set the machine so that it will
readily give $a + by$ as a function of $53{,}518 + y$. Now take $53{,}518 + y
= 53{,}550$ and compute $a + by$. Then we see by the list of primes that


$p(53{,}518 + y + 50 - 1) < a + by$,
so that

$53{,}518 + y + 50 < \pi(a + by) + 1$,
and so (8) holds for $53{,}550 \le 53{,}518 + y \le 53{,}600$. Put $53{,}518 + y
= 53{,}600$, and then

$p(53{,}518 + y + 50 - 1) < a + by$,


so that (8) holds for $53{,}600 \le 53{,}518 + y \le 53{,}650$. And so on. In general,
for $x$ in the neighborhood of $e^{13.4}$, one can advance $y$ by 50 at a time. Of
course irregularities in the distribution of the primes cause trouble occasionally.
For instance, if $53{,}518 + y = 53{,}800$ then $\pi(a + by) + 1 = 53{,}848$,
so it would appear that one could only advance $y$ by 48. However
$p(53{,}850 - 1)$ only exceeds $a + by$ by eleven, so that one can readily see
that if one increased $y$ by 48, then $a + by$ would be enough larger to justify
advancing $y$ by 2 more.

For x in other neighborhoods, one would advance y uniformly by some 
amount different from 50. Three factors determine the choice of this amount: 

I. It should be small enough so that exceptional cases occur infrequently. 

II. For speed in operating the machine, it should be a multiple of 10 
if possible. 

III. For convenience in locating entries in the list of primes, it should 
be a factor or multiple of 100. 
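
The stepping procedure of this proof can be imitated with a sieve in place of Lehmer's list and the calculating machine. The following Python sketch is only an illustration under stated assumptions, not the author's computation: the names `primes_up_to`, `can_advance`, and `bound_holds` are hypothetical, p(n) is indexed so that p(1) = 2, only integer values of y are tried, and the sieve limit of 700,000 is chosen merely to cover the few steps shown.

```python
from bisect import bisect_right
from math import isqrt


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def can_advance(primes, a, b, c0, y, step):
    """True if p(c0 + y + step - 1) < a + b*y, where p(1) = 2."""
    n = c0 + y + step - 1  # 1-based index into the list of primes
    return primes[n - 1] < a + b * y


def bound_holds(primes, a, b, c0, y):
    """Direct test of an inequality of the form (8): c0 + y < pi(a + b*y) + 1."""
    return c0 + y < bisect_right(primes, a + b * y) + 1


if __name__ == "__main__":
    # Constants quoted in the text for the specimen computation.
    a, b, c0 = 660003.22477, 13.41651, 53518
    primes = primes_up_to(700_000)
    for y in range(32, 282, 50):
        # Each True licenses an advance of y by 50, as in the text.
        print(c0 + y, can_advance(primes, a, b, c0, y, 50))
```

Whenever `can_advance` returns True for some y, the direct test `bound_holds` succeeds for every integer from y to y + step − 1, which is the reasoning that lets one advance by a whole block after a single look-up.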

THEOREM 2. For 0 < x ≤ 1,000,000, θ(x) < x. 

Proof. Clearly it suffices to prove θ(p(n)) < p(n) for p(n) ≤ 1,000,000. 
As in the proof of Theorem 1, we divide the interval 0 < x ≤ 1,000,000 into 
subintervals. So suppose we wish to prove θ(p(n + h)) < p(n + h) for 
n ≤ n + h ≤ N. If p(n) ≤ 10,000, compute θ(p(n)). If p(n) > 10,000, 
use Theorem 1 and Lemma 5 to compute a k such that θ(p(n)) < k. Then 

θ(p(n + h)) < k + h log N. 
So it suffices to prove that 
k + h log N < p(n + h). 
That is, it suffices to prove that 
π(k + h log N) + 1 < n + 1 + h. 

Put x = k + h log N, and we have the problem reduced to that of comparing 
π(x) + 1 with Ax + B + 1 in a given range, and appropriate modifications 
of the technique described in the proof of Theorem 1 will work. 
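
The reduction to primes in the first sentence of this proof also makes a direct machine check easy: θ(x) is constant between consecutive primes while x increases, so θ(x) − x is largest at the primes themselves. The Python sketch below is a brute-force illustration, not the method of the paper; the function names are hypothetical and floating-point logarithms are assumed accurate enough for the margins involved.

```python
from math import isqrt, log


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def max_theta_excess(limit):
    """Largest value of theta(p) - p over the primes p <= limit."""
    theta, worst = 0.0, float("-inf")
    for p in primes_up_to(limit):
        theta += log(p)  # theta(p) = sum of log q over primes q <= p
        worst = max(worst, theta - p)
    return worst


if __name__ == "__main__":
    # A negative value means theta(x) < x throughout 0 < x <= 1,000,000.
    print(max_theta_excess(1_000_000))
```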


1 
For 1< 1,000,000, 6(2) < (1 + i 
og 


THEOREM 3. For 2 ≤ x ≤ 1,000,000, π(x) < li(x). 


Proof. For e**, compare values of w(x) and li(x). For 


e?4 = = 1,000,000 use Theorem 2 and Lemma 4. 
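
The comparison of π(x) with li(x) can be reproduced numerically. Since π(x) jumps only at primes and li(x) is increasing, it suffices to check π(p) < li(p) at each prime p. The sketch below is an assumption-laden illustration: li is evaluated by the classical series γ + log log x + Σ (log x)^k/(k·k!), the number of terms and the function names are choices made here, and the default limit is kept below 1,000,000 only for speed.

```python
from math import isqrt, log

EULER_GAMMA = 0.5772156649015329


def li(x, terms=100):
    """Logarithmic integral for x > 1 via gamma + log(log x) + sum u^k/(k*k!)."""
    u = log(x)
    total, term = EULER_GAMMA + log(u), 1.0
    for k in range(1, terms + 1):
        term *= u / k  # term is now u^k / k!
        total += term / k
    return total


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def pi_below_li(limit):
    """True if pi(p) < li(p) at every prime p <= limit."""
    return all(k < li(p) for k, p in enumerate(primes_up_to(limit), start=1))


if __name__ == "__main__":
    print(pi_below_li(100_000))
```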


TueoreM 4, For {= 2 = 1,000,000, —log a li(a*) < 6(2). 


Proof. For compare values. For e® =a = 1,000,000, use 


Lemma 6, Cor. 


Remarks. If 10,000 < x ≤ 1,000,000, and one wishes more exact bounds 
for θ(x) than those given in Theorem 2 and Theorem 4, they can be obtained 
from the List of Prime Numbers with the aid of Lemma 5 and Lemma 6. 


It seems likely that Theorems 1-4 remain true if the upper limit of 
1,000,000 is replaced by an upper limit of 10,000,000. However it is known 
that all four theorems are false if one tries to replace the upper limit by 
infinity (4, Chapter V, especially Theorem 34 and Theorem 35). 


THEOREM 5. For 0 < x ≤ 1420 and for 1423 ≤ x ≤ 10,000, 


x − 2x^{1/2} < θ(x). 


Proof. Compare values. 

Remark. For x = 1421 and x = 1422, θ(x) < x − 2x^{1/2}. 
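
The phrase "compare values" can be made concrete for the inequality x − 2x^{1/2} < θ(x). The Python sketch below lists the integers at which it fails; it is an illustration only, it tests integer x rather than all real x, and the function name `exceptions` is hypothetical.

```python
from math import isqrt, log, sqrt


def exceptions(limit):
    """Integers x in [1, limit] for which x - 2*sqrt(x) < theta(x) fails."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    theta, bad = 0.0, []
    for x in range(1, limit + 1):
        if is_prime[x]:
            theta += log(x)  # theta(x) grows only at primes
        if not x - 2.0 * sqrt(x) < theta:
            bad.append(x)
    return bad


if __name__ == "__main__":
    # The remark in the text names x = 1421 and x = 1422 as exceptional.
    print(exceptions(10_000))
```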
6. For 0 << x= 10,000, 202580723 < 6(2). 
THEOREM 7. For 0 < x ≤ 1,000,000, x − 2.78x^{1/2} < θ(x). 


Proof. By Theorem 4 and Theorem 6, one only needs to prove that 
log x li(x^{1/2}) ≤ 2.78x^{1/2} for 10,000 ≤ x, and this is not difficult. 
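
The inequality log x · li(x^{1/2}) ≤ 2.78 x^{1/2} can be examined numerically by looking at the ratio log x · li(x^{1/2}) / x^{1/2}. The sketch below evaluates it on a geometric grid from 10,000 to about 1,000,000; it is a spot check under assumptions (series evaluation of li, a grid chosen here, hypothetical function names), not a proof.

```python
from math import log, sqrt

EULER_GAMMA = 0.5772156649015329


def li(x, terms=100):
    """Logarithmic integral for x > 1 via gamma + log(log x) + sum u^k/(k*k!)."""
    u = log(x)
    total, term = EULER_GAMMA + log(u), 1.0
    for k in range(1, terms + 1):
        term *= u / k  # term is now u^k / k!
        total += term / k
    return total


def ratio(x):
    """The quantity log(x) * li(sqrt(x)) / sqrt(x), to be compared with 2.78."""
    root = sqrt(x)
    return log(x) * li(root) / root


if __name__ == "__main__":
    for k in range(0, 49, 8):
        x = 1.0e4 * 1.1 ** k
        print(round(x), ratio(x))
```

The ratio is closest to 2.78 at the left end x = 10,000, which is consistent with the constant chosen in Theorem 7.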


A slight additional computation allows us to infer: 


COROLLARY. For 41 ≤ x ≤ 1,000,000, (1 − 1/log x)x < θ(x). 


THEOREM 8. For 17 ≤ x ≤ 1,000,000, x/log x < π(x). 


Proof. For 17 =x e+, compare values. For e* = 2= 1,000,000, use 


Theorem 1 and Lemma 8. 
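
For the inequality x/log x < π(x), the critical points are the values of x just below a prime, since π(x) is constant between primes while x/log x increases for x > e. The Python sketch below checks these left-hand limits for every prime gap starting at 17; the function names are hypothetical and the check is an illustration rather than the author's procedure.

```python
from math import isqrt, log


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def holds_from_17(limit):
    """Check x/log x < pi(x) on every interval [p_k, p_{k+1}) with p_k >= 17."""
    primes = primes_up_to(limit)
    for k, nxt in enumerate(primes[1:], start=1):
        # On [p_k, p_{k+1}) we have pi(x) = k and x/log x < p_{k+1}/log p_{k+1}.
        if primes[k - 1] >= 17 and not k > nxt / log(nxt):
            return False
    return True


if __name__ == "__main__":
    print(holds_from_17(1_000_000))
    # Just below 17 the inequality fails: pi(x) = 6 while 17/log 17 exceeds 6.
    print(6 > 17 / log(17))
```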


THEOREM 9. For e^2 < x ≤ 1,000,000, π(x) < x/(log x − 2). 


Similar proof using Theorem 3 and Lemma 7. 

THEOREM 10. For 1 < n ≤ 83,498, n log n + n log log n − 2n < p(n). 

Proof. For 1 < n ≤ 1480, log log n < 2, so that the theorem follows 
from the fact that p(n) > n log n (5, Theorem 1, p. 37). For 1480 ≤ n 
≤ 83,498, θ(p(n)) < p(n) by Theorem 2 and li(n) < n, so that the result 
follows by Lemma 9. 

THEOREM 11. If 6 ≤ n ≤ 83,498, p(n) < n log n + n log log n. 

Proof. For 6=n =e, compare values. For e® = nS 83,498, we use 
induction on n. Assume p(r) < r log r + r log log r for 6 ≤ r < n. Then 


log r—2 


we wish to prove that p(n) < n log n + n log log n. So suppose p(n) 
≥ n log n + n log log n. Then 


θ(n log n + n log log n) ≤ θ(p(n)). 


So by Theorem 7 and Lemma 10, 


n log n + n log log n — 2.78(n log n + n log log n)4 
n log log n 
< nlogn + nlog log n —n + Bas 
log n 
That is, 
s n log log n 
n < 2.78(n log n + n log log + 
log n 

which is false if n = e’. 
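
The two-sided bound n log n + n log log n − 2n < p(n) < n log n + n log log n can be tabulated directly from a sieve. The Python sketch below checks it for every n ≥ 6 with p(n) ≤ 1,000,000; it is a brute-force illustration with hypothetical function names, and it replaces the inductive argument of the text by plain enumeration.

```python
from math import isqrt, log


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def failures(limit):
    """Indices n >= 6 with p(n) <= limit violating either bound on p(n)."""
    bad = []
    for n, p in enumerate(primes_up_to(limit), start=1):
        if n < 6:
            continue
        main = n * log(n) + n * log(log(n))
        if not (main - 2 * n < p < main):
            bad.append(n)
    return bad


if __name__ == "__main__":
    print(failures(1_000_000))
```

The lower limit 6 in the upper bound is genuine: for n = 5 the expression n log n + n log log n is about 10.43, which is less than p(5) = 11.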


Bounds for ψ(x). Define 
φ(x) = ψ(x) − x + log 2π + ½ log(1 − 1/x^2). 


m h) 
fm,n,a(@, h, — + — zh", 


In the next five lemmas we assume x > 1 and x + mh > 1 so that the 
functions and integrals discussed will all exist and take only real values. 


LemMMA 11. Obviously 


h 
fm,r.a(Z; h, z)dz => K h). 


0 


LEMMA 12. Obviously 


fm, n, a( h, Yt Yo +- + Yn) )dyn n-1 h, + Y2 + + Yn-1) 
LEMMA 13. Km(2,h) 
h h h 
dy, dy: fm,n.a(2;, h, + + + Yn) dyn. 
0 0 UV 


Proof by induction on n, using Lemmas 11-12. 


Define 
fu(z, h, z) == fami (2, h, 2). 


LEMMA 14. If 0 < h, then there is a z such that 0 ≤ z ≤ mh and 
φ(x + z) ≤ f_m(x, h, z). 


Proof. Suppose that φ(x + z) > f_m(x, h, z) for 0 ≤ z ≤ mh. Then 




ah 


h 
0 0 0 


However the former equals K_m(x, h) by definition, and the latter equals 
K_m(x, h) by Lemma 13. 


“h < 0, then there is a z such that mh = z= 0 and 


LEMMA 1a. 
= = fm(2, h; 2). 
=2z=0. Then 


Proof. Suppose that hm(2, h.z) for mh= 


0 0 0 
dy; f f + t+ yz + ym) dym 


*0 0 0 
h h h 


However the former equals (−1)^m K_m(x, h) by definition, and the latter 
equals (−1)^m K_m(x, h) by Lemma 13. 


and 


(x—1)/rm, 


THEOREM 12. 
v6) més (x; 78) 


m(a, — #8) 
then 
x(1 − ε_1) − log 2π − ½ log(1 − 1/x^2) ≤ ψ(x) 


≤ x(1 + ε_2) − log 2π − ½ log(1 − 1/x^2). 


Proof. Put h = xδ. Use Lemma 14, and put in the definitions of φ(x) 
and f_m(x, h, z). Hence there is a z such that 0 ≤ z ≤ mh, and 


ψ(x + z) − (x + z) + log 2π + ½ log(1 − 1/(x + z)^2) ≤ f_m(x, h, z). 


Replace h by xδ, and one has 


ψ(x + z) ≤ x(1 + ε_2) − log 2π − ½ log(1 − 1/(x + z)^2). 


However, 0 ≤ z, so that ψ(x) ≤ ψ(x + z) and 


−½ log(1 − 1/(x + z)^2) ≤ −½ log(1 − 1/x^2). 


This proves half the theorem. To prove the other half, put h = −xδ, and 
proceed similarly, using Lemma 15. 

Henceforth, we denote non-trivial zeros (4, p. 58) of ζ(s) by ρ = β + iγ. 


h 
THEOREM 13. + z)dz= — (a + 
0 


p(p +1) 1) 
Proof. It is known (6, p. 317) that 
= log 2z, 
and also (4, p. 30, p. 73) that 
lu = —— — 


The theorem follows from these and the fact that =I e | = is convergent 


(4, Theorem 18, p. 57). This latter fact enables us to integrate the result 
of Theorem 13 term by term and deduce Theorem 14. 


THEOREM 14. Ky(z,+ 28) = 


germ m 


{ 


“(p+ m) 


Define 
(9) K = K(m,r) => 
THEOREM 15. Jf 068, then ‘ 
| Km(a@, + << + + 1)"K. 


Proof. Take absolute values of both sides of Theorem 14. 


+1)-- 
1, Theorem 16, p. 48). Also 


< 


re + 7§)erm | (1 + j8) < (1 + 


>> (— 1 ) | <= (" (1 + +1 
j=0 


<8 ("Jia 4 
—3(") 
= ((1 +8)" 41)”, 


THEOREM 16. With K as in (9) and «, and e& as in Theorem 12, 
if one takes 


| 
( 
So 

1) 


1% 




m+1 m 


then < PO and < 6. 
Proof by use of Theorem 15. 


We now derive an upper bound for K. First we need information about 
the zeros of ζ(s). Let N(T) denote the number of ρ's for which 0 < γ ≤ T. 
Define 


F(T) = (T/2π) log(T/2π) − T/2π + 7/8, 


R(T) = 0.137 log T + 0.443 log log T + 1.588. 
THEOREM 17. For 0 ≤ T ≤ 280, |N(T) − F(T)| < 1. 


Proof. This can be deduced from the computations of Hutchinson 
(7, pp. 49-60). 
Choose A so that F(A) = 1041. Then A = 1467.47747 correct to five 
decimal places. 
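
The value of A can be recovered by solving F(A) = 1041 numerically. The Python sketch below assumes the standard form F(T) = (T/2π) log(T/2π) − T/2π + 7/8 and uses plain bisection; the bracketing interval [100, 3000] and the function names are choices made here for illustration.

```python
from math import log, pi


def F(T):
    """Assumed main term: (T/2pi) log(T/2pi) - T/2pi + 7/8."""
    u = T / (2 * pi)
    return u * log(u) - u + 7 / 8


def solve_F(target, lo=100.0, hi=3000.0):
    """Bisection for F(T) = target; F is increasing on the bracket used."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # To be compared with the value A = 1467.47747 quoted in the text.
    print(solve_F(1041.0))
```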


THEOREM 18. For 0 < T ≤ A, |N(T) − F(T)| < 2; N(A) = F(A) 
= 1041; and for 0 < γ ≤ A, β = ½. 

Proof. The computations of Titchmarsh (8, pp. 234-250, and 9, pp. 
261-263) are almost adequate to give these results, the missing computation 
being that which proves that N(A) < 1043. This computation has been 
performed by the present author, using a variation of a method of Titchmarsh 
(8, pp. 251-252). 

THEOREM 19. For 2 ≤ T, |N(T) − F(T)| < R(T). 

Proof. For 2 ≤ T ≤ A, the theorem follows from Theorems 17-18. 
For A ≤ T, apply the method of Backlund (10, pp. 354-375) to (ζ(s))^N 
instead of to ζ(s). For instance, the first part of Backlund's discussion (10, 
pp. 354-361) when applied to (ζ(s))^N yields 

|N(T) log | db 


ent ‘ 
" 

f where i: is a constant independent of V, r= 1.32, 




log = 0.565314, 


G(s) ((E(s + + + +Z—Ti))*}, 

From Backlund's bounds for |ζ(s)| (10, pp. 361-369) and the fact that 
A ≤ T, one can deduce upper bounds for |G(re^{iθ})|. To deduce an upper 
bound for −log |G(0)|, note that 


G(0) = (£(5/4 i7)%). 
Hence we can choose a succession of N’s tending to infinity so that 


G(0) 
| 


Lim 


However (5, p. 26), 


|£(5/4+1iT)|= 
All other steps in the proof are strictly analogous to the steps of Backlund’s 
proof. 
THEOREM 20. For A ≤ γ, 1/(17.72 log γ) ≤ β ≤ 1 − 1/(17.72 log γ). 
Proof. Use the procedure outlined by Landau (6, pp. 318-324) using 
the numerical evaluations of Rosser (5, pp. 30-31) and the fact that 


18 + 30 cos φ + 17 cos 2φ + 6 cos 3φ + cos 4φ 
= 2(1 + cos φ)^2 (1 + 2 cos φ)^2 ≥ 0. 
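
The trigonometric identity used here, 18 + 30 cos φ + 17 cos 2φ + 6 cos 3φ + cos 4φ = 2(1 + cos φ)^2 (1 + 2 cos φ)^2, follows on expanding both sides as the polynomial 2 + 12c + 26c^2 + 24c^3 + 8c^4 in c = cos φ. As a quick sanity check, the Python sketch below compares the two sides on a grid of angles; the function names are hypothetical.

```python
from math import cos, pi


def lhs(t):
    """The trigonometric polynomial 18 + 30cos t + 17cos 2t + 6cos 3t + cos 4t."""
    return 18 + 30 * cos(t) + 17 * cos(2 * t) + 6 * cos(3 * t) + cos(4 * t)


def rhs(t):
    """The factored form 2(1 + cos t)^2 (1 + 2cos t)^2, visibly non-negative."""
    return 2 * (1 + cos(t)) ** 2 * (1 + 2 * cos(t)) ** 2


if __name__ == "__main__":
    grid = [2 * pi * k / 1000 for k in range(1000)]
    print(max(abs(lhs(t) - rhs(t)) for t in grid))
```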


Having no further use for the old meaning of φ(x), we now define 


y-1/(17.72 log y) 


= $(m, x, y) 


LEMMA 16. With K as in (9) 


1 
KSa | .,m+1 | 
A<y 
Proof. By definition, 
== 
ea ly™*| | 


However 


1 1 




Since β − iγ is a zero of ζ(s) if β + iγ is (4, Theorem 16, p. 48), 


psaly™* 
Y>0 
However, by Theorem 20, 


So 


$(y). 
; 


m+1 | < 2 2 

> 

By Theorem 18, if β > ½ and γ > 0, then γ > A. Also 1 − β + iγ is a 
zero of ζ(s) if β + iγ is (4, Theorem 18, p. 48). So 


2 > o(y) = oy). 
B>s Y>A 


LEMMA 17. > < 0.0463;. & < 0.00167; < 0.0000744. 
Proof. The first result has been proved before (5, pp. 28-30). From it 


we deduce the second result as follows. Compute 


si 


0<y=50 


from Gram’s values of y (11, p. 297). Hence deduce an upper bound for 


50<7 
Dividing by 52.970 (7, p. 59) gives an upper bound for 


1 
50<7 Y 
The second result now follows readily. For the third result, note that Gram 


(11, p. 293) gives the result 


+ — 0000 1858 6299 6426---, 
p 


where a = y + (@—4)i. Remembering that B = 3 if 0 << y S A, we readily 


deduce the third result. 
LEMMA 18. If log x ≤ 942(m + 1), then 


‘ |_| 
p-1 
13 


Proof. By a theorem connecting sums with integrals (4, Theorem A, 
p. 18), 
—— 
JA 
However for A ≤ y, 
log x ≤ 942(m + 1) < 17.72(m + 1) log^2 A ≤ 17.72(m + 1) log^2 y, 
and so φ'(y) < 0. Hence by Theorem 19 
fe 
<— FW) + dy 
A<Y 


Integrating by parts gives 


< +B + RAO). 


However 
3.47 
1 y 
F’(y) $(y) dy = 5 4 log $(y) dy, 
A 
and 
00 © d(y)dy p(y) dy 
R’(y) dy 0.137 y ylogy 
0.137 log o(y)dy+ 0.44 3 f log $(y) dy. 
A log — A log A log =~ 
~ Qe 


Lemma 19. If 
loga< 
i m + 0.184 
then 
1 + 5.454m 
m +- 0.18: 


1— —log 
( 942m° 


Proof. Integrating the left hand side by parts, using 


log (y)dy 4 


»1/130 


log 


aT 
as the part to be integrated, we get 


1 + m log 


= 
4 og y mypl/17.72 log A 


(m i/log) log x 
+- f log p(y) dy. 


2 2 
A 17.72m? log? y 2 


So 




A 
m-+1/log—logr 
00 y 1 + 5.454m 
d 
f, log on o(y)dy < m2 A myi/130 17.72m? log? A f, 8 on dy. 


oO 
Solving for f log # $(y)dy proves the theorem. 
JA aT 


THEOREM 21. If 


m + 0.184 
3.47 0.869m + 0.160 
=24 atk 


log a ) m2Amqi/130 


942m? 


and 1 + mδa < a, then for a ≤ x, 
x(1 − ε) − 1.84 < ψ(x) < x(1 + ε) − log(1 − 1/x^2). 


Proof. Since K(m, x) ≤ K(m, a) if a ≤ x, the theorem follows from 
Theorem 12, Theorem 16, Lemma 16, Lemma 18, and Lemma 19. 

By use of this theorem, Table II was computed. In those cases where 
m=4 or m =5, x was large enough so that one could use 0.0000744 (see 


Lemma 17) as an upper bound for 
1 
Cand 
without appreciably affecting the value of «. 
By a method of Rosser (5, pp. 39-40 and Lemma 10, p. 31) one can prove 


Theorem 22. 
THEOREM 22. Jf = (log 79) and = 2, then 
< (1+ (2) )a 


Remarks. From (2), Theorem 2, Theorem 7, Table II, Theorem 22, 
and Gram’s table of y(a), one can deduce: 
Por i<2= and for = 2; 


1 


For 1 <a, < u(t) <(1+ 


log log x 


6) 


A curious fact is that ψ(x)/x takes its maximum at x = 113, so that 
one can say that 
ψ(x) < 1.038821x 
for all positive x. 
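
The location of the maximum of ψ(x)/x is easy to observe by machine. Since ψ(x) changes only at integers while 1/x decreases, it suffices to examine integer x. The Python sketch below accumulates the von Mangoldt function over prime powers; it only searches up to a finite limit chosen here, so it illustrates the remark rather than proving it for all positive x, and the function name is hypothetical.

```python
from math import log


def psi_ratio_max(limit):
    """Return (x, psi(x)/x) maximizing psi(x)/x over integers 1 <= x <= limit."""
    lam = [0.0] * (limit + 1)  # von Mangoldt function: log p at prime powers
    composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for multiple in range(p * p, limit + 1, p):
            composite[multiple] = 1
        q = p
        while q <= limit:
            lam[q] = log(p)
            q *= p
    psi, best = 0.0, (1, 0.0)
    for x in range(1, limit + 1):
        psi += lam[x]
        if psi / x > best[1]:
            best = (x, psi / x)
    return best


if __name__ == "__main__":
    print(psi_ratio_max(100_000))
```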
In an earlier draft of this paper, a much more complicated, but more 
accurate, bound for K (depending on an improvement of Theorem 20) was 


used to deduce the result: 


For 1 < a, 


However the computations supporting this result were quite extensive 


and have never been checked. It is the author’s hope to eventually improve 


the upper bound for K so that one can prove: 


For e® < z, 
0.95 0.95 


If this could be done, one could replace the upper bounds on zx and n in 


Theorems 23-28 by infinity. 
The range 1,000,000 ≤ x. By (2) and Table II we may infer that: 


(10) For e^{13.8} ≤ x, (1 − 0.0393)x < θ(x) < (1 + 0.0376)x. 
(11) For e^{15} ≤ x, (1 − 0.0328)x < θ(x) < (1 + 0.0321)x. 
(12) For e^{20} ≤ x, (1 − 0.02)x < θ(x) < (1 + 0.0199)x. 


For e*° = x, we may take Table II and Theorem 22 as referring to 6(z) 
as well as to w(x). Hence, with the help of Theorem 2, Corollary and 
Theorem 7, Corollary one can readily deduce Theorems 23, 24, 29C, and 
30C. Similarly, one can deduce Lemmas 20-21. 


LemMA 20. For 22, 


2.85 2.8% 
log x log x 


LEMMA 21. For 


0.96 0.96 
< (2) < 
Remarks. Because of Theorem 2 and (10), we may infer that θ(x) 
< 1.0376x for all positive x. Hence 


Π_{p ≤ x} p < (2.83)^x 


for positive x. Because of Theorem 5, Theorem 7, and (10), we may infer 
that 0.9607x < θ(x) for 2600 ≤ x. Computation and Theorem 6 gives the 
result that 0.6932x < θ(x) for 29 ≤ x. Hence 

(2.61)^x < Π_{p ≤ x} p for 2600 ≤ x, 


2^x < Π_{p ≤ x} p for 29 ≤ x. 
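
The bound 2^x < Π_{p ≤ x} p is the same as θ(x) > x log 2, and the threshold 29 can be checked by looking at the left-hand limit at each prime: on an interval [q, p) between consecutive primes, θ(x) = θ(q) while x log 2 stays below p log 2. The Python sketch below performs this check up to a finite limit chosen here; the function names are hypothetical and the check illustrates, but does not replace, the argument in the text.

```python
from math import isqrt, log


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, in increasing order."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def two_power_bound_holds(limit):
    """Check theta(x) > x log 2 on every gap [q, p) of consecutive primes, q >= 29."""
    primes = primes_up_to(limit)
    theta = 0.0
    for q, p in zip(primes, primes[1:]):
        theta += log(q)  # theta(x) equals theta(q) throughout [q, p)
        if q >= 29 and not theta > p * log(2):
            return False
    return True


if __name__ == "__main__":
    print(two_power_bound_holds(100_000))
    # At x = 28 the bound fails: theta(28) = theta(23) is below 28 log 2.
    print(log(2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23) < 28 * log(2))
```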
In order to derive Theorems 25, 26, 29A, and 30A, we proceed as follows. 
First note that by (7) and Theorem 24, 


13 x x dy 
(13) < ' log?x Jo log? y 
for 2 = Since 
dy y d 
f 
log® y 2log?y J 2log*y 
and 
li(y) 
log? y log y Y); 


one can compute the value of the right side of (13) for any x. In particular 
one can show that for e?®-8 < z, the right side of (18) is less than x/ (log — 2). 
Theorem 25 now follows by use of Theorem 9. 

By (7), 
logx Jai ylog’?y 


a(x) > 


f41<= 7. So for 414S27S e'™, 


x dy 
n(x) > — 
log*x log? y 41 


by Theorem 23. For e*** =z, the right side of this is greater than x/log a. 
So Theorem 26 follows by use of Theorem 8. 


Now assume e?° = 2, Then by (7) and Lemma 20 


(14) x f dy 42.85 dy 


logz log? log? y 2 loge 


dy 2 f- my 
2 log? loo? ve log® log? y 


dy x ‘ dy 
3 : 
f log® y S 2 log*y’ 


However 


and 


From these inequalities, one can easily show that the right side of (14) is 
less than x/(log x − 4), and so prove one half of Theorem 29A. The proof 
of the other half is similar. 

Now assume e?°° = 2, Then by (7) and Lemmas 20-21, 


0.962 49 _ dy 


This is treated in the same manner that (14) was, and one half of Theorem 
30A is proved thereby. The other half is proved in a similar manner. 
We now consider Theorem 27, Theorem 29B, and Theorem 30B. To 


prove the last two, we will need to prove the results 
(n + 1)log(n +1)+(n + 1)log log(n + 1)—(n + 1)— A(n+ 1) < p(n) 


for A = 3 and A = 1 respectively. Let a ≤ p(n), and suppose that for 
a ≤ x, θ(x) < (1 + ε)x. Let 80,000 ≤ n. Now suppose that 


p(n) ≤ (n + 1) log(n + 1) + (n + 1) log log(n + 1) − 2(n + 1). 
Since 80,000 ≤ n, we can infer 
p(n) < n log n + n log log n − 2n + 2 log n. 
Then by Lemma 9, 


nlogn + n log log n 
< (1+) (nlogn + n log log n — 2n + 2 log n). 
That is 
o 
(15) 1+ 2< + e(log n + log log n). 


n 


Now if a=e'** and aS p(n), then 82,395 n. Also, by (10), we can 
take « = 0.0376. Then (15) is false for If a= we can take 
0.0199 by (12). Then (15) is false for nSe*. If a—e*, we can 
take «0.0119 by Table I]. Then (15) is false for nS e*°. And so on 
up to ne)”, This with Theorem 10 proves Theorem 27, and a similar 
procedure proves one half of Theorem 29B and Theorem 30B. 


We now consider Theorem 28. Suppose a ≤ p(n), and that for a ≤ x, 
(1 − ε)x < θ(x). Also suppose that p(r) < r log r + r log log r for 
16 ≤ r < n and that n log n + n log log n ≤ p(n). Then by Lemma 10, 


(1 − ε)(n log n + n log log n) < n log n + n log log n − n + (n log log n)/(log n). 
That is, 


1 < (log log n)/(log n) + ε(log n + log log n). 


For 83,000 ≤ n ≤ e®°, this is false, and so Theorem 28 follows. 


LEMMA 22. 


θ(p(n)) < n log n + n log log n − n + (2n log log n)/(log n). 


Proof. Since p(n) < n log n + 2n log log n (5, Theorem 2, p. 40), 


θ(p(n)) < θ(p(15)) + Σ_{r=16}^{n−1} log(r log r + 2r log log r) + log(n log n + 2n log log n). 


From here, proceed as in the proof of Lemma 10. 
By use of Lemma 22, we can readily prove the rest of Theorem 29B and 
Theorem 30B. 


TABLE I. 

     x      θ(x)                 x      θ(x) 

  2000   1939.839 200         6500   6408.907 ?67 
  2500   2433.602 748         7000   6920.421 ?31 
  3000   2932.359 205         7500   7364.857 ?18 
  3500   3409.457 181         8000   7875.150 ?86 
  4000   3911.145 393         8500   8343.999 ?65 
  4500   4412.188 301         9000   8870.374 ?97 
  5000   4911.695 346         9500   9418.368 ?76 
  5500   5391.372 236        10000   9895.991 ?80 
  6000   5893.297 458 


TABLE II. 

Against values of b are tabulated those values of ε such that one can deduce 
from Theorem 21 that (1 − ε)x < ψ(x) < (1 + ε)x for e^b ≤ x. Since the 
value of ε that can be deduced from Theorem 21 depends on a preassigned 
value of m, the values of m which were used are tabulated with b and ε. 

     b     m     ε               b     m     ε 

  13.8     2     0.0381                      0.00511 
  15       2     0.0321                3     0.00467 
  20       2     0.0199                      0.00427 
  30             0.0179        650           0.00392 
  40             0.0119        700           0.00356 
  60             0.0101        800           0.00294 
  80             0.00983       900           0.00235 
  90             0.00938      1000           0.00194 
  100            0.00932      1300           0.00104 
  300            0.00710      2000           0.000383 
  400            0.00609      2300           0.000255 
  450            0.00567      3000           0.000158 


PRINCETON UNIVERSITY. 


BIBLIOGRAPHY. 


1. J. P. Gram, "Maengden af Primtal under en given Graense," K. Danske 
Vidensk. Selskabs Skrifter, ser. 6, vol. 2 (1881-86), pp. 183-308. 

2. G. W. Jones, Logarithmic Tables, Macmillan and Co., Seventh Edition, 1898. 

3. D. N. Lehmer, List of Prime Numbers from 1 to 10,006,721, Carnegie Institution 
of Washington Publication No. 165, 1914. 

4. A. E. Ingham, The Distribution of Prime Numbers, Cambridge Tract No. 30, 
1932, London. 

5. J. B. Rosser, "The n-th prime is greater than n log n," Proceedings of the 
London Mathematical Society, ser. 2, vol. 45 (1939), pp. 21-44. 

6. E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen. 

7. J. I. Hutchinson, "On the roots of the Riemann zeta function," Transactions 
of the American Mathematical Society, vol. 27 (1925), pp. 49-60. 

8. E. C. Titchmarsh, "The zeros of the Riemann zeta-function," Proceedings of the 
Royal Society of London, Series A, vol. 151 (1935), pp. 234-255. 

9. E. C. Titchmarsh, "The zeros of the Riemann zeta-function," Proceedings of the 
Royal Society of London, Series A, vol. 157 (1936), pp. 261-263. 

10. R. J. Backlund, "Über die Nullstellen der Riemannschen Zetafunktion," Acta 
Mathematica, vol. 41 (1917), pp. 345-375. 

11. J. P. Gram, "Note sur les zéros de la fonction ζ(s) de Riemann," Acta 
Mathematica, vol. 27 (1903), pp. 289-304. 
